Chapter 1

INTRODUCTION 



For four decades the evolution of integrated circuits has followed Moore’s 
law, according to which the number of transistors per square millimeter of 
silicon doubles every 18 months. At the same time transistors have become 
faster, making possible ever-increasing clock rates in digital circuits. This 
trend seems set to continue for at least another decade without slowing down. 
Thus, in the near future the processing power of digital circuits will continue 
to increase at an accelerating pace. 

For analog circuits the evolution of technology is not as beneficial. Thus, 
there is a trend to move signal processing functions from the analog domain to 
the digital one, which, besides allowing for a higher level of accuracy, provides 
savings in power consumption and silicon area, increases robustness, speeds 
up the design process, brings flexibility and programmability, and increases 
the possibilities for design reuse. In many applications the input and output 
signals of the system are inherently analog, preventing all-digital realizations; 
at the very least a conversion between analog and digital is needed at the in- 
terfaces. Typically, moving the analog-digital boundary closer to the outside 
world increases the bit rate across it. 

In telecommunications systems the trend to boost bit rates is based on em- 
ploying wider bandwidths and a higher signal-to-noise ratio. At the same time 
radio architectures in many applications are evolving toward software-defined 
radio, one of the main characteristics of which is the shifting of the analog- 
digital boundary closer to the antenna. 

Because of these trends, there is an urgent need for data converters with 
increasing conversion rates and resolution. A part of this needed performance 
upgrade comes with the technology evolution, but often the demand is higher 
than this alone can provide. Thus, there is still room, and a need, for innovations 
in circuit design. 

The increasing integration level leads to systems with a smaller number of
chips, the ultimate goal being a single-chip solution, the system on a chip
(SoC). This means that analog and digital circuits have to live on the same
silicon die, which brings additional challenges in analog design, such as
mixed-signal issues and limitations in the choice of technology. Data
converters are inherently mixed-signal circuits and face the same challenges
on a smaller scale even without going as far as SoC. Furthermore, the
evolution of technology has been driven by the microprocessor industry and
hence does not always go in the best direction for analog design. However,
the recent rapid growth of the wireless telecommunications devices market has
given a boost to the development of advanced mixed-signal technologies, such
as silicon-germanium-based BiCMOS.

The main challenges in data converter design are decreasing supply voltage, 
short channel effects in MOS devices, mixed signal issues, the development 
of design and simulation tools, and testability. In analog-to-digital converters 
(ADCs), they need to be met at the same time as the requirements for sampling 
linearity, conversion rate, resolution, and power consumption are becoming 
tighter. 

This book concentrates on low voltage issues in ADCs by searching for 
and developing techniques and circuit structures suitable for today’s and the 
future’s low voltage technologies. In parallel, the increasing demands for 
ADCs have been answered by developing high-frequency high-linearity sam- 
pling techniques and applying them to ADC prototypes, which are presented in 
the last chapter. 



Chapter 2 

LOW VOLTAGE ISSUES 



The whole history of integrated circuits has followed a trend of descending 
supply voltage. For a long time the de facto standard was 5 volts. The migration 
to a 3.3-volt supply in the mid-’90s started a trend in which almost every new 
process generation has a lower nominal supply voltage than its predecessor. 
Today the 0.25-µm generation uses a 2.5-V supply and, according to the
Semiconductor Industry Association’s roadmap [16], it will be scaled down to
1.2 V by 2004 and to 0.9 V by 2008.

Table 2.1 shows process parameters for different technology generations. 
The data, including the effective channel length, supply voltage, oxide thick- 
ness, threshold voltage, and threshold voltage matching parameter, is collected 
from real processes. The table is reprinted from [17]. 

Table 2.1. Technology data collected from different processes [17].

There are two main drivers for voltage scaling: technology and power. The
shrinking technology feature size leads to lower breakdown voltages, and thus
supply voltage scaling is mandatory. Due to the ever-increasing integration
level, which aims toward a system-on-a-chip (SoC), the power dissipation of a
single chip tends to rise, which leads to severe heat problems and increased
cooling system costs. On the other hand, the rapidly growing market for
portable battery-operated devices, such as laptops, PDAs, and cellular phones,
demands high signal processing capacity together with low power dissipation.

The power dissipation of a logic gate is given by

$$P_{dig} = \alpha \cdot V_{DD}^2 \cdot C_L \cdot f_{clk}, \qquad (2.1)$$

where $V_{DD}$ is the supply voltage, $C_L$ the load capacitance, $f_{clk}$ the
clock frequency, and $\alpha$ the switching probability. It is obvious that as a
result of the quadratic dependence, the most effective way to reduce power
consumption is to lower the supply voltage. Although it affects the circuit
speed, every new technology generation comes with enhanced device
characteristics and the possibility of increasing parallelism in the logic,
which together more than compensate for the speed loss.

The situation regarding analog circuits is much more complicated and, as will
be shown later, not so bright. The fundamental limits of power consumption
in different types of analog circuits are discussed in several papers: switched
capacitor filters in [18], continuous time filters in [19, 20], and data
converters in [21]. The common finding in these papers is that there is a
certain energy needed to represent the signal and the resulting power is
proportional to the signal or clock frequency and the desired signal-to-noise
ratio, but not dependent on the semiconductor devices used or the supply
voltage. In reality that fundamental limit cannot be reached or even
approached because of the limitations of the technology (speed, noise,
parasitic capacitances, etc.) and the circuit topologies. Low-voltage circuit
techniques for different applications have been compared and analyzed, e.g.
for filters in [22], for analog and digital video signal processing in [23],
and for various applications from sensor readout circuits to RF circuits in
[24].

The remainder of this chapter concentrates on analyzing how supply voltage
scaling affects the analog circuits, when the limitations of a CMOS technology
are taken into account. The analysis concentrates on SC circuits and their most
important building blocks: opamps and switches. The opamp is assumed to
dominate power consumption, while the speed is determined by both the opamp
and the switches.

To simplify the calculations, the transistor current is assumed to follow the
square-law model, which does not describe very accurately the behavior of
deep-submicron transistors. The short channel effects and their impact on the
results are, however, also discussed briefly.

1. Signal-to-Noise Ratio

One fundamental difference in analog signal processing compared to the
digital is the significance of thermal noise [25], which sets a limit for the
smallest distinguishable signal in the analog circuits. On the other hand, the
supply voltage limits signal amplitude on the high side. The difference between
these two boundaries determines the dynamic range, which is a key parameter in
most systems. For a sinusoidal signal the peak signal-to-noise ratio is
determined by



$$\sqrt{SNR_{max}} = \frac{V_{max}}{2\sqrt{2} \cdot V_n} = \frac{V_{DD} - V_{margin}}{2\sqrt{2} \cdot V_n}, \qquad (2.2)$$

where $V_{max}$ is the maximum peak-to-peak signal amplitude and $V_n$ the rms
noise voltage. In real circuits the signal can never go all the way from the
negative supply rail to the positive one; thus the $V_{margin}$ in the second
form of the equation. The required margin is highly dependent on circuit
topology and somewhat dependent on current levels and the process parameters.
It is typically some hundreds of millivolts at its minimum, ranging up to
several volts at its maximum.

Lowering the supply voltage leads to decreased signal-to-noise ratio unless
the noise level is scaled down simultaneously. The cost of keeping the dynamic
range (DR) constant, in terms of circuit speed and current consumption, is
discussed next.
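
As an illustration of equation (2.2), the following minimal sketch evaluates the peak signal-to-noise ratio in decibels. The function name `peak_snr_db` and the numerical values (a 100-µV rms noise level and a 1-V margin) are assumptions chosen only for the example; they do not come from the text.

```python
import math

def peak_snr_db(vdd, vmargin, vn_rms):
    """Peak SNR in dB of a full-swing sine wave, following equation (2.2)."""
    v_max = vdd - vmargin  # maximum peak-to-peak signal amplitude
    return 20.0 * math.log10(v_max / (2.0 * math.sqrt(2.0) * vn_rms))

# Hypothetical numbers: 100 uV rms noise, 1 V margin, two supply voltages.
for vdd in (5.0, 3.0):
    print(f"VDD = {vdd:.1f} V -> peak SNR = {peak_snr_db(vdd, 1.0, 100e-6):.2f} dB")
```

With these assumed values the usable swing halves when the supply drops from 5 V to 3 V, so the peak SNR falls by about 6 dB unless the noise is reduced as well.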

The thermal noise in CMOS circuits originates mainly from two sources: 
the resistors and the transistors. Which one dominates is circuit-dependent. In 
switched capacitor circuits the dominant noise source is typically the switch 
on-resistance and the rms noise voltage is given by

$$V_n = \sqrt{\frac{k_1 kT}{C}}, \qquad (2.3)$$

where $k$ is Boltzmann’s constant, $T$ the absolute temperature, $C$ the sampling
capacitor, and $k_1$ a constant dependent on circuit topology. For a given
circuit the only way a designer can reduce the noise is to increase the
capacitance.

The equation for gate-referred MOS transistor noise has the following form:

$$V_n = \sqrt{\frac{4 \gamma kTB}{g_m}}, \qquad (2.4)$$

where $B$ is the noise bandwidth, $g_m$ the transistor transconductance, and
$\gamma$ the noise excess factor. It will be shown later that in SC circuits
$g_m$ must be scaled linearly with the capacitance $C$ to keep the circuit speed
unaffected. This leads to the same type of supply voltage dependency in circuit
speed and power consumption as with equation (2.3), and thus this case will not
be discussed separately.

One hypothetical way to reduce thermal noise is cooling. However, substan- 
tial noise reduction is not obtained without extensive cooling. For example, 
cooling a circuit from room temperature (300 K) to the temperature of liquid 
nitrogen (77 K) reduces the noise by 6 dB or, alternatively, allows capacitor 
sizes to be reduced by a factor of four. 
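
The cooling figures quoted above can be checked with a short calculation based on the kT/C form of the noise voltage. This is a sketch only: the function names and the default $k_1 = 1$ are assumptions made for the illustration, and the 1-pF capacitor is a hypothetical value.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def ktc_noise_rms(c_farads, temp_k=300.0, k1=1.0):
    """RMS noise voltage sqrt(k1*k*T/C); k1 = 1 is an assumed topology constant."""
    return math.sqrt(k1 * K_BOLTZMANN * temp_k / c_farads)

def cooling_gain_db(t_hot=300.0, t_cold=77.0):
    """Reduction of thermal noise power in dB when cooling from t_hot to t_cold."""
    return 10.0 * math.log10(t_hot / t_cold)

print(f"V_n for 1 pF at 300 K: {ktc_noise_rms(1e-12) * 1e6:.1f} uV rms")
print(f"Cooling 300 K -> 77 K: {cooling_gain_db():.1f} dB")
print(f"Equivalent capacitor reduction factor: {300.0 / 77.0:.1f}")
```

Because noise power is proportional to $T/C$, the same 3.9-fold ratio appears either as roughly 6 dB less noise or as a capacitor that is about four times smaller.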



2. Circuit Speed 

The speed of analog circuits is not usually directly dependent on the supply 
voltage. However, if the dynamic range is kept constant while decreasing the 
supply voltage, the capacitances have to be larger and thus the circuit speed 
is reduced. In addition, low-voltage circuit topologies are, in many cases, 
inherently slower than their high-voltage counterparts. Furthermore, the values
of the parasitic drain and source junction capacitances increase as the
substrate doping level increases and the reverse bias voltage decreases.

In switched capacitor circuits the maximum clock frequency is inversely 
proportional to the settling time, which is determined by slew rate and opamp 
bandwidth. For a single stage opamp the gain bandwidth product is given by 



$$GBW = \frac{g_m}{2\pi C_L} = \frac{g_m}{k_3 C}, \qquad (2.5)$$

where $g_m$ is the transconductance of the input transistor and $C_L$ the load
capacitance, which can be approximated to be proportional to the sampling
capacitor $C$ with a circuit-dependent proportionality factor $k_3$. Solving $C$
from (2.3) and (2.2), and substituting it into (2.5), yields



$$GBW = \frac{g_m (V_{DD} - V_{margin})^2}{8 k_3 DR \cdot k_1 kT}. \qquad (2.6)$$

Thus, when the settling time is dictated by the opamp GBW, the speed of an SC
circuit decreases with the square of the supply voltage if DR is kept constant.
It should be noted that the bandwidth loss can be compensated for by increasing
the transconductance $g_m$.
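
The quadratic dependence can be made concrete with a small numerical sketch. It assumes that DR is the peak SNR of (2.2) expressed as a power ratio, so that (2.2) and (2.3) give $C = 8 k_1 kT \cdot DR / (V_{DD} - V_{margin})^2$. The function names, the defaults $k_1 = 1$ and $k_3 = 2\pi$ (that is, $C_L = C$), and all numerical values are assumptions for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def sampling_cap(dr, vdd, vmargin, temp_k=300.0, k1=1.0):
    """Sampling capacitor needed for a dynamic range dr (power ratio, not dB)."""
    return 8.0 * k1 * K_BOLTZMANN * temp_k * dr / (vdd - vmargin) ** 2

def gbw_hz(gm, dr, vdd, vmargin, temp_k=300.0, k1=1.0, k3=2.0 * math.pi):
    """Gain bandwidth gm / (k3 * C), with C taken from sampling_cap()."""
    return gm / (k3 * sampling_cap(dr, vdd, vmargin, temp_k, k1))

dr = 10 ** (70 / 10)  # hypothetical 70-dB dynamic range
for vdd in (5.0, 3.3, 1.8):
    c = sampling_cap(dr, vdd, 1.0)
    f = gbw_hz(1e-3, dr, vdd, 1.0)
    print(f"VDD = {vdd} V: C = {c * 1e12:.3f} pF, GBW = {f / 1e9:.2f} GHz at gm = 1 mS")
```

For a fixed $g_m$ and a fixed DR, halving the available signal swing quadruples the required capacitor and cuts the bandwidth to one quarter.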



Especially in moderate-resolution circuits, the settling time can be dictated
by the slew rate, which is given by

$$SR = \frac{k_4 V_{max}}{T_S} = \frac{I_{SR}}{C_L}, \qquad (2.7)$$

where $T_S$ is the clock period and $I_{SR}$ the available slewing current.
Solving $T_S$ and substituting $C_L$ and $V_{max}$ from (2.3) and (2.2) yields

$$T_S = \frac{k_4 V_{max} C_L}{I_{SR}} = \frac{k_2 k_4 DR \cdot kT}{I_{SR} (V_{DD} - V_{margin})}, \qquad (2.8)$$

which indicates that the attainable clock rate decreases linearly with the supply
voltage unless the slewing current is increased.

3. Power Consumption 

In the previous section it was shown that scaling down the supply voltage
causes speed loss in the SC circuits unless the $g_m$ of the opamp input
transistor and/or the slewing current $I_{SR}$ is increased. The power
consumption that results when both the circuit speed and the DR are preserved
while the supply voltage is reduced is analyzed next.

3.1 Saturated MOSFET in Strong Inversion 

Let us first look at the case where the circuit speed is limited by the opamp
GBW. The opamp is a single stage operational transconductance amplifier
(OTA) and its input transistor is realized with a MOSFET biased in the
saturation region. Thus, the transconductance is

$$g_m = \sqrt{2 I_D \mu C_{ox} \frac{W}{L}} = \frac{\sqrt{2 I_D \mu C_G}}{L}, \qquad (2.9)$$

where $I_D$ is the drain current, $\mu$ the carrier mobility, $C_{ox}$ the gate
oxide capacitance, and $W$ and $L$ the channel width and length. The second form
of the equation is written using the expression for the gate capacitance
$C_G = C_{ox} W L$. The gate capacitance appears between the opamp input and the
ground and thus has an effect on the transfer function. It can be shown (see
Appendix B) that in SC amplifiers (and integrators) there exists an optimum
gate capacitance $C_{G,opt}$, which minimizes the settling time. This optimum is
proportional to the sampling capacitor: $C_{G,opt} = k_5 C$, where $k_5$ is a
circuit-dependent proportionality factor.

Now equation (2.5) for GBW can be rewritten

$$GBW = \frac{\sqrt{2 \mu k_5} \sqrt{I_D}}{k_3 L \sqrt{C}}. \qquad (2.10)$$

Solving $I_D$ yields

$$I_D = \frac{k_3^2 L^2 GBW^2 C}{2 \mu k_5}. \qquad (2.11)$$

Using this the power consumption can be calculated to be

$$P_{SC} = V_{DD} I_D = \frac{k_3^2 L^2 GBW^2 C \cdot V_{DD}}{2 \mu k_5} = \frac{4 k_1 k_3^2 L^2 GBW^2 \cdot DR \cdot kT \cdot V_{DD}}{\mu k_5 (V_{DD} - V_{margin})^2}, \qquad (2.12)$$

where the second form is obtained by substituting $C$ from (2.3) and (2.2). Next
it is necessary to ask whether $V_{margin}$ depends on $V_{DD}$. An assumption
valid in most cases is that $V_{margin}$ is determined by the saturation voltage
$V_{dsat} = V_{GS} - V_T$ of the amplifier output stage transistors. An analysis
carried out in Appendix C shows that, if the technology is fixed, $V_{dsat}$ will
be constant.

So it can be seen that for a given technology (fixed $L$) the power tends to
increase when the supply voltage is decreased.



3.2 Saturated MOSFET in Weak Inversion 

In many low-power and low-to-medium speed circuits the transistors are
biased in the weak inversion region, where the transconductance is linearly
dependent on the drain current and independent of the transistor aspect ratio.
The power consumption can be calculated as before, resulting in

$$P_{SC,wi} = \frac{8 k_1 k_3 GBW \cdot DR \cdot k^2 T^2 n \cdot V_{DD}}{q (V_{DD} - V_{margin})^2}, \qquad (2.13)$$

where $n$ is the subthreshold slope factor and $q$ the charge of an electron. The
parameter $n$ is only slightly dependent on the bias point and the technology
line width, being approximately 1.3 in a bulk silicon technology and 1.0 in a
fully depleted silicon on insulator (SOI) technology [26].
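
The linear relation between transconductance and current in weak inversion can be sketched numerically using the standard expression $g_m = q I_D / (n k T)$. The function name and the 1-mS target transconductance are assumptions for the example; the two values of $n$ are the approximate figures quoted above.

```python
K_BOLTZMANN = 1.380649e-23    # Boltzmann's constant in J/K
Q_ELECTRON = 1.602176634e-19  # elementary charge in C

def weak_inversion_current(gm, n=1.3, temp_k=300.0):
    """Drain current needed for transconductance gm in weak inversion.

    Uses gm = q * I_D / (n * k * T), solved for I_D.
    """
    return gm * n * K_BOLTZMANN * temp_k / Q_ELECTRON

# Hypothetical target of 1 mS, bulk (n ~ 1.3) versus fully depleted SOI (n ~ 1.0).
for label, n in (("bulk", 1.3), ("FD-SOI", 1.0)):
    i_d = weak_inversion_current(1e-3, n=n)
    print(f"{label}: I_D = {i_d * 1e6:.1f} uA for gm = 1 mS")
```

Since the aspect ratio does not enter the expression, a wider transistor buys no additional transconductance once the device is in weak inversion; only more current does.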



3.3 Slew Rate Limited Power Consumption 

When the SC circuit speed is slew rate limited, the power consumption is
proportional to the slewing current and can be calculated using (2.8):

$$P_{SC,sr} \propto \frac{k_2 k_4 DR \cdot kT \cdot V_{DD}}{T_S (V_{DD} - V_{margin})}. \qquad (2.14)$$

It is interesting to note that $V_{margin}$ is the only term in the equation
that has some technology dependency.



3.4 Technology Impact 

How do the previous equations change when the voltage is scaled along
with the technology? At least down to the 0.07-µm generation, the maximum
supply voltage scales roughly linearly with the technology line width (see
Table 2.1). Taking this into account by replacing the $L$ proportionality with
$V_{DD}$ proportionality, the equation (2.12) for strong inversion can be
rewritten

$$P_{SC,scaledL} \propto \frac{V_{DD}^3}{(V_{DD} - V_{margin})^2}. \qquad (2.15)$$

The analysis in Appendix C yields that $V_{margin}$ is proportional to
$V_{DD}^m$, $m$ ranging from 1 to 1.5. Substituting this into the previous
equation results in a curve that decreases linearly with $V_{DD}$ for $m = 1$
and decreases even faster if $m$ is larger.



The technology scaling does not change the equations for weak inversion
and slew-rate determined power except through $V_{margin}$. When $m = 1$ the
slew-rate limited power is constant.
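
The three scaling laws can be compared with a short sketch that keeps only the supply-dependent factors of (2.13), (2.14), and (2.15). The margin model $V_{margin} = a \cdot V_{DD}^m$ follows the proportionality stated above, but the constant $a = 0.3$ and the function names are assumptions made purely for illustration.

```python
def margin(vdd, a=0.3, m=1.0):
    """Assumed margin model V_margin = a * VDD**m (a = 0.3 is hypothetical)."""
    return a * vdd ** m

def p_strong_scaled(vdd, a=0.3, m=1.0):
    """Relative strong-inversion power with L scaled along VDD, as in (2.15)."""
    return vdd ** 3 / (vdd - margin(vdd, a, m)) ** 2

def p_weak(vdd, a=0.3, m=1.0):
    """Relative weak-inversion power, supply-dependent part of (2.13)."""
    return vdd / (vdd - margin(vdd, a, m)) ** 2

def p_slew(vdd, a=0.3, m=1.0):
    """Relative slew-rate-limited power, supply-dependent part of (2.14)."""
    return vdd / (vdd - margin(vdd, a, m))

for vdd in (3.3, 2.5, 1.8, 1.2):
    print(f"VDD = {vdd} V: strong = {p_strong_scaled(vdd):.2f}, "
          f"weak = {p_weak(vdd):.2f}, slew = {p_slew(vdd):.2f}")
```

With $m = 1$ the strong-inversion value falls in proportion to the supply, the weak-inversion value rises as the supply falls, and the slew-rate-limited value stays constant.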



What makes power reduction possible in strong inversion is the fact that 
supply voltage scaling forces one to increase sampling capacitor size, resulting, 
as a positive side effect, in a larger optimum transistor gate capacitance. At 
the same time, the scaled-down technology offers more transconductance for 
a fixed current and fixed gate capacitance. In practice, the transconductance 
upgrade is obtained by increasing the $W/L$ ratio, which moves the bias point
toward weak inversion. Eventually, the transistor enters into weak inversion, 
where transconductance is not dependent on the aspect ratio and thus the power 
starts to increase according to (2.13). 



3.5 Power Consumption: Summary and Conclusions 

Figure 2.1 illustrates how the power consumption of thermal noise-limited 
SC circuits depends on the supply voltage, according to the previous equations. 
It is clear that, for a given technology, reducing the supply voltage increases 
the power consumption, the limiting factor being either opamp bandwidth or 
slew rate. When a scaled technology is utilized, with its nominal supply volt- 
age, the slew rate-limited power is independent of the supply voltage, and the 
opamp bandwidth-limited power consumption decreases with the supply when 
the transistors are in strong inversion and increases when they are biased in
weak inversion. In practice, the power follows whichever curve is largest and,
as a result, there is an optimum technology after which further scaling leads to 
increased power consumption. The analysis presented in [20] shows a similar 
trend in continuous time circuits. 

The analysis has not taken account of the short channel effects, the most
important of which is velocity saturation. The carrier velocity saturates when
the $V_{GS} - V_T$ voltage reaches a certain value, which is several volts for
long channel devices but decreases with technology scaling. In velocity
saturation the transistor current does not follow the square law; instead it is
given by [27]



$$I_D = v_{sat} C_{ox} W (V_{GS} - V_T). \qquad (2.16)$$

Consequently, the transconductance is independent of the current. As the
technology is scaled down, the strong inversion region, which lies between the
weak inversion and the velocity saturation, gets narrower and eventually
vanishes altogether [28, 29].

Figure 2.1. The effect of supply voltage scaling on the power consumption of an
SC circuit (plot of power vs. supply voltage). A case where the technology line
width is scaled along the supply voltage is compared to a case where the
technology is fixed.

In Appendix B it is shown that for a given current and fixed L the minimum 
settling time of an SC amplifier is obtained in the strong inversion region and, 
when it is absent, at the point where the weak inversion region turns into the 
velocity saturation region. In the latter case the power consumption follows the 
weak inversion curve. 

4. Matching 

In most low-to-medium resolution circuits capacitor sizes are limited not by
thermal noise but by matching. For example, in a 10-bit pipelined ADC with a
2-V signal range the thermal noise sets the capacitor size below 0.1 pF, which
is considerably smaller than typical capacitor sizes in 10-bit ADCs. In the
signal transfer function capacitor ratios multiply voltages. Thus, the errors
resulting from the mismatch do not depend on the absolute voltage values, making
it possible to scale down the supply and the signal range without changing the
capacitor size. This results in a linear power reduction in
opamp-bandwidth-limited power consumption and an even larger reduction in the
slew rate-limited power consumption, even without technology scaling. At some
point the thermal noise floor will be reached and the capacitor size has to be
increased.
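
The order of magnitude in the 10-bit example can be reproduced with a rough estimate. The sketch below assumes, as one possible criterion that the text does not specify, that the kT/C noise power is allowed to equal the quantization noise power $LSB^2/12$; the function name and the default $k_1 = 1$ are likewise assumptions.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K

def min_cap_for_resolution(bits, v_range, temp_k=300.0, k1=1.0):
    """Capacitance at which k1*k*T/C equals the quantization noise LSB^2/12.

    The equal-noise criterion is an assumption made for this estimate.
    """
    lsb = v_range / 2 ** bits
    return 12.0 * k1 * K_BOLTZMANN * temp_k / lsb ** 2

c_min = min_cap_for_resolution(bits=10, v_range=2.0)
print(f"10 bits, 2 V range: C >= {c_min * 1e12:.3f} pF")
```

Under this assumption the thermal-noise-limited capacitor is on the order of 0.01 pF, which remains below 0.1 pF even if $k_1$ is several times larger than one.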

Another type of mismatch error is the comparator and opamp offset voltage,
originating from transistor threshold voltage and $\beta$ ($= C_{ox} \mu W/L$)
mismatch. In the signal transfer function the offset appears in an additive
manner, thus causing larger relative error if the signal range is decreased. A
widely accepted model for the threshold voltage mismatch of two transistors is
given by [30, 31]

$$\sigma^2(\Delta V_T) = \frac{A_{VTH}^2}{WL} + S_{VTH}^2 D^2 \qquad (2.17)$$

and for the $\beta$ mismatch by

$$\frac{\sigma^2(\Delta \beta)}{\beta^2} = \frac{A_{\beta}^2}{WL} + S_{\beta}^2 D^2. \qquad (2.18)$$

In most analog circuits the last term, which depends on the distance $D$ between
the devices, is small compared to the first one in both the equations and thus
can be neglected.
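
A minimal sketch of the threshold mismatch model (2.17) shows how the offset spread depends on the gate area. The function name and the matching parameter of 5 mV·µm are hypothetical values chosen for the example, not data from Table 2.1.

```python
import math

def sigma_delta_vt(a_vth, w, l, s_vth=0.0, d=0.0):
    """Standard deviation of the V_T mismatch of two transistors per (2.17).

    a_vth is in V*m, w and l in m, s_vth in V/m, d in m.
    """
    return math.sqrt(a_vth ** 2 / (w * l) + (s_vth * d) ** 2)

A_VTH = 5e-9  # hypothetical matching parameter, 5 mV*um expressed in V*m
for w_um in (10.0, 40.0):
    sigma = sigma_delta_vt(A_VTH, w_um * 1e-6, 1e-6)
    print(f"W = {w_um} um, L = 1 um: sigma(dVT) = {sigma * 1e3:.2f} mV")
```

With the distance term neglected, quadrupling the gate area halves the standard deviation of the offset, which is why low-offset comparators tend to use large input devices.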

In today’s technologies the threshold voltage mismatch is the dominant
source of offset in typical analog circuits. From Table 2.1 it can be seen that
the parameter $A_{VTH}$ scales linearly with the line width, and thus technology
scaling improves the matching. The $\beta$ mismatch, however, does not improve
significantly with the technology, and consequently at some point it starts to
dominate [32], putting an end to the trend of improving matching. Although
$V_T$ matching improves with technology, the decreasing supply voltage and the
increasing oxide capacitance result in worse matching with a given circuit speed
and power consumption [33, 32].

Offset voltages are a major concern in flash and folding-and-interpolating 
ADCs, while pipelined ADCs are fairly robust against offsets. It has to be 
remembered that since mismatch is time invariant, it is fundamentally different 
from thermal noise. Therefore, various techniques can be used to reduce, cor- 
rect, and calibrate the errors originating from mismatch. These techniques can 
be analog, digital, or mixed signal. 

5. Operational Amplifiers 

When opamps are employed as building blocks in SC circuits, they can 
almost always be used in the inverting feedback configuration, where the signal 
swing in the opamp input is very small. Thus, the input structure does not limit 
the signal range. The opamp output stage, in contrast, sees the full signal swing 
and ultimately sets the maximum limit for it. 

Figure 2.2. Output signal swing in different output stages: a cascoded output
stage and a rail-to-rail output stage.



Single-stage opamp topologies, such as the telescopic opamp or the folded 
cascode opamp, are regarded as the fastest and most power-efficient structures 
for integrated applications. In achieving the required high DC gain they rely on 
cascoding, which limits the output signal swing. When two or more gain stages 
are used, the required DC gain can be obtained without cascoded structures in the 
amplifier output. This type of opamp, for example, the traditional Miller opamp, 
requires somewhat more power to achieve the same bandwidth as single-stage 
opamps. 

In Figure 2.2 two output stages and corresponding signal swings are shown: 
a cascoded output stage on the left and a rail-to-rail output stage on the right. 
Despite its name, the latter circuit cannot provide true rail-to-rail signal swing, 
since there is one transistor between the signal and the supply rail. This is, 
however, as close as it is possible to go: thus, the name is justified. For proper 
operation the transistors have to be in saturation, which requires the
drain-source voltage to be at least equal to the saturation voltage $V_{dsat}$,
which depends on the current level and technology line width. In typical opamp
designs the values are some hundreds of millivolts. In practice, in addition to
the $V_{dsat}$, some extra voltage margin has to be reserved to achieve
robustness against inaccurate biasing and to get a decent output impedance.

The reduction in signal swing in the cascode stage is twice that in the rail- 
to-rail stage. It is obvious that as the supply voltage gets lower the margin eats 
an increasing portion of the signal range. How this affects power consumption 
is illustrated in Figure 2.3. There the power consumption, given by equation 
(2.12), is plotted against the supply voltage for both output stages. The power 
is normalized in such a way that for the cascode circuit it is 1 at 5 V. The voltage 
margin in the rail-to-rail circuit is set to 1.0 V and in the cascode circuit to 2.0 V.



Figure 2.3. Power consumption of circuits using different types of opamp (plot
of power vs. supply voltage).



At 5 V the circuit with the cascode stage dissipates 78% more power, which
will probably be compensated for by a more efficient opamp topology. When
the supply voltage is lower, the penalty grows bigger: at 3 V the cascode
circuit dissipates four times the power of the rail-to-rail circuit, which
undoubtedly favors the opamp with the rail-to-rail output stage. Thus, opamp
topology has a significant effect on power consumption. Consequently,
multi-stage opamps with a rail-to-rail output stage are preferable choices in
low-power low-voltage designs.
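
The comparison of the two output stages can be reproduced from the supply dependence of (2.12) for a fixed technology, which is proportional to $V_{DD} / (V_{DD} - V_{margin})^2$. The sketch below uses the 2.0-V and 1.0-V margins stated above; the function names are assumptions for the illustration.

```python
def relative_power(vdd, vmargin):
    """Supply-dependent factor of (2.12) for a fixed technology."""
    return vdd / (vdd - vmargin) ** 2

def cascode_penalty(vdd, m_cascode=2.0, m_rail=1.0):
    """Power of the cascode-stage circuit relative to the rail-to-rail one."""
    return relative_power(vdd, m_cascode) / relative_power(vdd, m_rail)

for vdd in (5.0, 4.0, 3.0):
    print(f"VDD = {vdd} V: cascode / rail-to-rail = {cascode_penalty(vdd):.2f}")
```

The ratio is about 1.78 at 5 V and 4.0 at 3 V, and it grows without bound as the supply approaches the cascode margin of 2 V.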

6. MOS Switches 

Another crucial building block in the SC circuits is the switch. An ideal 
switch has infinite resistance when it is open and zero resistance when it is 
closed. At high supply voltages (5 V and higher), a MOS transistor has been a 
good enough approximation of that. On the other hand, the finite on-resistance 
of a closed MOS switch has already caused problems in some applications 
with a 3-V supply and the problems are believed to get worse as the supply is 
scaled down. Whether these problems are due to increased circuit performance 
requirements or to the technology scaling is investigated next. 

The on-resistance of a MOS switch can be written as 

$$R_{on} = \frac{L}{W \mu C_{ox} (V_{GS} - V_T)}, \qquad (2.19)$$

where $V_{GS}$ is the transistor gate-source voltage and $V_T$ the threshold
voltage. The equation is valid when $V_{GS} > V_T$; with smaller gate-source
voltages the resistance is infinite. To turn the switch properly on, its
gate-source voltage has to be $V_T$ plus some overdrive to make the
on-resistance small enough.

Figure 2.4. Inverse of the switch on-resistance as a function of the signal
voltage for an nMOS (a) and a CMOS (b) switch with high supply voltage and a
CMOS switch with low supply voltage (c).

A single-transistor switch cannot conduct over the whole rail-to-rail signal
range, since, for example, an nMOS switch, whose gate is tied to $V_{DD}$, cuts
off when the signal level is raised within a threshold voltage of $V_{DD}$. This
is illustrated in Figure 2.4 (a), where the inverse of the on-resistance is
plotted against the signal level. The whole range can be covered by putting an
nMOS and a pMOS transistor in parallel to form a CMOS switch or a transmission
gate (Figure 2.4 (b)). The on-resistance has its largest value in the mid-range
between the supplies, when the overdrive voltage is approximately
$V_{DD}/2 - V_T$. Thus, the maximum on-resistance, which is given by

$$R_{on,max} = \frac{L}{C_{ox} \mu W (V_{DD} - 2 V_T)}, \qquad (2.20)$$

has a strong correlation with the supply voltage.

When $V_{DD}$ becomes smaller than $V_{T,n} + V_{T,p}$, a non-conducting gap
appears in the mid-supply range (Figure 2.4 (c)). In most applications the
switch becomes useless much earlier as a result of too-large on-resistance,
which makes the settling times long. Another problem, especially in S/H
circuits, is the signal-dependent nature of the resistance, which causes
harmonic distortion when sampling continuous time signals. Consequently, the
resistance has to be much smaller than in the case of a constant on-resistance.
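
The behavior of the CMOS switch over the signal range can be sketched with the square-law on-resistance model of (2.19). This is a simplified illustration that ignores the body effect and assumes equal strength for both transistors; the function names, the normalized strength factors `k_n` and `k_p` (standing for $W \mu C_{ox} / L$), and the threshold values are all hypothetical.

```python
def cmos_switch_conductance(v_sig, vdd, vt_n, vt_p, k_n=1.0, k_p=1.0):
    """Relative on-conductance (1/R_on) of a CMOS switch at signal level v_sig.

    vt_p is the magnitude of the pMOS threshold; body effect is ignored.
    """
    g_n = k_n * max(vdd - v_sig - vt_n, 0.0)  # nMOS: gate at VDD, source at v_sig
    g_p = k_p * max(v_sig - vt_p, 0.0)        # pMOS: gate at ground, source at v_sig
    return g_n + g_p

def has_conduction_gap(vdd, vt_n, vt_p):
    """True when neither transistor conducts somewhere in the mid-supply range."""
    return vdd < vt_n + vt_p

for vdd in (3.3, 1.0):
    mid = cmos_switch_conductance(vdd / 2, vdd, 0.6, 0.6)
    print(f"VDD = {vdd} V: mid-range conductance = {mid:.2f}, "
          f"gap = {has_conduction_gap(vdd, 0.6, 0.6)}")
```

With matched transistors the mid-range conductance equals $V_{DD} - 2 V_T$ in these normalized units, in agreement with (2.20), and it drops to zero once the supply falls below the sum of the two thresholds.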

What is the effect of technology scaling on the on-resistance? Because of
increasing leakage current in digital circuits the threshold voltage cannot be
scaled down linearly with the supply voltage, but the scaling rather follows a
square root function. The threshold voltage values from the 0.35-µm generation
to the 0.07-µm generation in Table 2.1 fit quite accurately to
$0.32 \cdot \sqrt{V_{DD}}$. Using this and assuming a linear dependence between
the line width and the supply voltage, equation (2.20) can be rewritten

$$R_{on,max} \propto \frac{V_{DD}^2}{C_G \mu (V_{DD} - 0.64 \cdot \sqrt{V_{DD}})}, \qquad (2.21)$$

where $C_G = C_{ox} W L$.

Remembering that reducing the signal range increases the capacitances, the
correct parameter to look at is not the resistance but the time constant, which
is given by

$$\tau = R_{on,max} C \propto \frac{C}{C_G} \cdot \frac{V_{DD}^2}{V_{DD} - 0.64 \cdot \sqrt{V_{DD}}}. \qquad (2.22)$$

The parasitic capacitances of the switch are proportional to $C_G$ and it is
reasonable to assume that they can be allowed to grow at the same rate as the
capacitance $C$. Thus, the first term in the equation is constant and the second
one decreases linearly with the supply at high $V_{DD}$ values, but starts to
rise rapidly when the supply goes below 0.9 V.
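
The location of the turning point can be checked numerically from the supply-dependent factor $V_{DD}^2 / (V_{DD} - 0.64 \sqrt{V_{DD}})$. The sketch below uses a plain grid search; the function name, the grid limits, and the step size are arbitrary choices made for the illustration.

```python
import math

def tau_rel(vdd):
    """Supply-dependent factor of the switched-capacitor time constant."""
    return vdd ** 2 / (vdd - 0.64 * math.sqrt(vdd))

# Grid from 0.45 V to 3.0 V in 1-mV steps, above the singularity near 0.41 V.
grid = [0.45 + 0.001 * i for i in range(2551)]
v_best = min(grid, key=tau_rel)

print(f"Minimum time constant near VDD = {v_best:.3f} V")
print(f"tau_rel at 0.6 V: {tau_rel(0.6):.2f}, at the minimum: {tau_rel(v_best):.2f}, "
      f"at 3.0 V: {tau_rel(3.0):.2f}")
```

The minimum lies close to 0.92 V, and the factor diverges as the supply approaches $0.64^2 \approx 0.41$ V, where the overdrive in the middle of the signal range vanishes.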

Methods to reduce the on-resistance can be divided into two categories: 
technology-based and circuit-based. The technology-based methods reduce the 
resistance by lowering the transistor threshold voltage, while the circuit-based 
methods increase the overdrive voltage. 

One method is the use of a dual-$V_T$ process, which provides two types of
transistors, with either a high or low threshold voltage. The idea of this
technology is to improve the speed of digital logic by using low-$V_T$
transistors in critical places but at the same time keeping the leakage current
small with high-$V_T$ transistors. In analog designs the low-$V_T$ transistors
can be employed as switches. There are, however, at least two problems with this
technology: it is not a mainstream technology, at least today, and so it costs
more and is not necessarily available with analog extensions (capacitors and
resistors). Circuits with low threshold switches may also suffer from charge
leakage.

Another technology that may alleviate the switch problem is the silicon- 
on-insulator (SOI) technology. SOI MOSFETs have inherently lower leakage 
current than their bulk CMOS counterparts and so with the same channel length 
the threshold voltage of an SOI transistor is typically smaller. The future will 
show whether this technology will become mainstream. 

The methods that improve the switch by means of circuit techniques increase
the overdrive voltage from $0.5 V_{DD} - V_T$. The switch transistor gate-source
voltage can always be raised to at least $V_{DD}$ without violating the
technology specifications. A switch with a gate-source voltage of $V_{DD}$ will
have an overdrive voltage of $V_{DD} - V_T$, which is a substantial improvement
on the earlier situation. One technique that can be used to realize this is
gate-voltage bootstrapping. In this the voltage at the switch transistor gate is
capacitively boosted above the supply voltage so that it follows the signal
voltage with an offset equal to $V_{DD}$.

Figure 2.5. Time constant of a switched capacitor as a function of supply
voltage.

Another completely different circuit technique that also results in an
overdrive of $V_{DD} - V_T$ is the switched opamp technique [34]. In this the SC circuit
and the opamp are modified in such a way that all the switches can operate 
against ground or virtual ground. Consequently, the switch does not see any 
voltage swing and the overdrive is always the maximum. 

In Figure 2.5 the time constant is plotted as a function of the supply volt- 
age with two different overdrive voltages, with and without technology scaling. 
It can be seen that technology scaling yields smaller SC time constants, even 
without any special techniques, down to a supply of 0.9 V. Increasing the over- 
drive brings a significant advantage by lowering the on-resistance in all supply 
voltages and extending switch usability to lower voltages. 

Both bootstrapping and the switched opamp technique are discussed and 
analyzed more deeply in the following chapters. 




7. Conclusions 

In today’s world, where digital circuits offer more speed and capacity with 
reduced power consumption with every new technology generation, the expec- 
tations for analog circuits are similar. It has, however, been clear for a long 
time that the benefits of technology scaling are not so great for them, mainly 
because of the decreasing supply voltage. During recent times the question has 
rather been whether or not the technology scaling degrades the performance of 
the analog circuits. 

The analysis of SC circuits presented in this chapter shows that there is, 
indeed, some benefit to be gained from technology scaling, at least for the next 
few technology generations. This, however, is probably not enough for the 
increased requirements and expectations that exist. Thus, there is an urgent 
need for techniques at the architectural and circuit levels for alleviating the 
problems associated with low voltage and more effectively taking advantage of 
the technology. 

At the circuit level it is of the utmost importance to maximize the signal 
swing, which has a large impact on opamps and switches. In noise-limited 
circuits, using a supply voltage smaller than the maximum allowed does not 
bring any advantage at the circuit level. When accuracy is limited by capacitor 
matching, a lower supply may be justified. 

Technology scaling is best exploited by doing things digitally. This does 
not only mean moving the signal processing functions from the analog domain 
to the digital one but also combining both techniques in realizing the system 
blocks. A good example is that of digitally self-calibrated pipelined ADCs, 
where technology scaling has made possible the incorporation of more and more 
complex digital calibration algorithms into ADCs. Similar types of techniques, 
to be used against mismatch and nonlinearity, can probably be developed for 
other applications as well in order to improve performance and permit more 
robust analog structures. 




Chapter 3 



SAMPLE-AND-HOLD OPERATION 



1. S/H Basics and Performance Metrics 

The main function of a sample-and-hold (S/H) circuit is to take samples of 
its input signal and hold these samples in its output for some period of time. 
Typically, the samples are taken at uniform time intervals; thus, the sampling 
rate (or clock rate) of the circuit can be determined. 

The operation of an S/H circuit can be divided into sample mode (sometimes
also referred to as acquisition mode) and hold mode, whose durations need not be
equal. In hold mode, the output of the circuit is equal to the previously sampled 
input value. In sample mode, the output can either track the input, in which 
case the circuit is often called a track-and-hold (T/H) circuit, or it can be reset 
to some fixed value. In some circuits the output is held over the whole period 
of the sampling clock. This is achieved by having separate circuitry to perform 
the sampling and the holding operations. 

The most common terms and performance metrics used in conjunction with
S/H circuits ([35, 36]) are briefly introduced in the remainder of this section.
Which of them are more important than the others greatly depends on the
application of the S/H circuit. The technology utilized also has some effect on
which parameters are usually given in circuit specifications. To fully
characterize an S/H circuit, specifications both in the time domain and in the
frequency domain have to be defined. Unfortunately, some of the terms used are
not well established and thus the definitions in different sources may be
contradictory.

Figure 3.1. Output waveforms of different S/H circuits.

Figure 3.2. Time domain S/H circuit performance metrics.

The acquisition time is the time from the command to switch from hold mode 
to sample mode to the moment when the circuit is ready to take a new sample, 
i.e. it tracks the input. Acquisition time is one of the parameters that define the 
maximum achievable sampling rate. 

The aperture time or aperture delay is the fixed time from the sampling 
command to the moment when the sample is actually taken. 

Random variation in the sampling moment is known as aperture uncertainty 
or aperture jitter. 

The hold mode settling time determines the time from the sampling moment 
to the moment when the circuit output has settled within the specified accuracy 
of its steady state value. If the S/H circuit is used in front of an ADC, the ADC 
can digitize the S/H circuit output at that moment. The hold mode settling time 
has a major impact on the maximum sampling rate of the S/H circuit. 

The signal may leak from the circuit when in hold mode. The rate of change 
in output that results from this is specified by the droop rate. 

The hold step or pedestal error is usually defined for track-and-hold circuits.
It is the difference in the output value at the end of the tracking and during
hold mode. The pedestal may be signal-dependent and thus produce harmonic
distortion.

During hold mode the signal at the circuit input may couple to the output. 
The fraction of the input signal seen at the output is specified by the hold mode 
feedthrough. 

Usually, S/H circuits have a unity gain (i.e. the amplitude of the output signal 
is equal to the amplitude of the input signal), but other gain values can be used 
as well. The gain error determines the deviation of the gain from the nominal 
value. 

The dynamic range is the difference in decibels between the maximum al- 
lowed input voltage and the minimum input voltage that can be sampled with a 
specified level of accuracy. 

Nonlinearity in the S/H circuit causes distortion. Measured with a sinusoidal 
input signal, the total harmonic distortion (THD) is the ratio of the sum of
error energy in the frequencies harmonically related to the input frequency to 
the signal energy at the fundamental frequency. The THD can be given as a 
percentage or in decibels. In sampled data systems, aliasing complicates the 
identification of the harmonic frequencies in the spectrum. 

The spurious free dynamic range (SFDR) is the ratio of the signal energy at
the fundamental frequency to the energy of the largest spurious component.

The signal-to-noise ratio (SNR) is the ratio of signal energy to noise energy.

The signal-to-noise-and-distortion ratio (SNDR or SINAD) is the ratio of
signal energy to all error energy. Quite often the term signal-to-noise ratio is used
although SNDR is actually meant.

When an S/H circuit is employed in the ADC front-end it is meaningful
to speak of resolution, which is expressed as a number of bits. Resolution is
just another way to express the SNDR for the maximum input signal and it is
obtained by (SNDR − 1.76)/6.02, with the SNDR given in decibels.

2. Spectra of Sampled Signals 

An ideal S/H circuit takes samples of an input signal at uniform intervals T. 
In the time domain this corresponds to multiplying the signal by an impulse 
train 

$$y(t) = x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT), \tag{3.1}$$

where $\delta(t)$ represents Dirac’s delta function. The result is a train of impulses
whose values correspond to the instantaneous values of the input signal. 
Figure 3.3. Sampling in time domain.



The spectrum of the sampled signal is a convolution of the input spectrum 
and the spectrum of the impulse train, which is also an impulse train 

$$Y(f) = X(f) * \frac{1}{T} \sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T}\right). \tag{3.2}$$

This is illustrated in Figure 3.4, where fs is the sampling frequency and B 
the signal bandwidth. The resulting spectrum is the original spectrum plus an 
infinite number of images of the original spectrum centered at multiples of the 
sampling frequency. The figure also clearly shows that as long as the bandwidth 
of the input signal is less than half of the sampling frequency the images do 
not overlap and thus the original signal can be restored by filtering. If this 
condition — known as the Nyquist criterion — is not satisfied, a part of the image 
spectrum is aliased into the desired signal band, causing irreversible distortion. 
Because of this, the input signal usually has to be band limited before sampling 
in order to avoid the aliasing of noise and other unwanted signals present outside 
the desired signal band. In sub-sampling (or under-sampling) the aliasing is 
utilized to sample high frequency narrow-band signals. There, a signal band 
around some multiple of the sampling frequency is aliased to the baseband, 
which actually corresponds to down conversion. This can be used in radio
receivers to digitize the intermediate frequency (IF) signal, using a relatively
narrow-band ADC. In principle, the signal can be sub-sampled even at radio
frequency (RF), but noise aliasing and sampling clock jitter limit performance
and prevent the use of the technique in most systems [37].
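
The folding of frequencies described above can be illustrated with a short numerical sketch. The function name and the example frequencies are illustrative assumptions, not taken from the text; the sketch only assumes ideal impulse sampling of a single real tone.

```python
def aliased_frequency(f_in: float, f_s: float) -> float:
    """Return the frequency in [0, f_s/2] at which a real tone at f_in
    appears after ideal sampling at rate f_s."""
    f = f_in % f_s              # images repeat at multiples of f_s
    return min(f, f_s - f)      # mirror the upper half onto [0, f_s/2]


if __name__ == "__main__":
    f_s = 10e6  # assumed sampling rate, 10 MHz
    # A tone below f_s/2 is left where it is.
    print(aliased_frequency(3e6, f_s))
    # A narrow-band signal near 7 * f_s is sub-sampled down to baseband.
    print(aliased_frequency(71e6, f_s))
```

In the second call the 71 MHz tone lands at 1 MHz, which is the down conversion effect exploited in sub-sampling.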

2.1 Spectrum of a Sampled and Held Signal 

In practice, the output waveform of a sampling circuit cannot be a train of 
infinitely narrow impulses. In most practical implementations the sample is 
held in the output of the circuit until the next sample is taken (Figure 3.5). In 
that case the circuit is known as a sample-and-hold (S/H) circuit. Sometimes the 
output tracks the input for half of the sample period and is held in the sampled 
value for the other half. This type of circuit can be called a track-and-hold
(T/H) circuit. However, inconsistent terminology is quite often seen.
Figure 3.4. Spectrum of a sampled signal.

Figure 3.5. A sampled-and-held signal.



If the signal processing after an S/H circuit is performed in discrete time, which
is, for instance, the case in ADCs with a front-end S/H circuit, the spectrum is 
the ideal periodic spectrum as in Figure 3.4. On the other hand, if the output 
waveform of the S/H circuit is used as a continuous time signal the spectrum 
is different. A deglitcher preceding a DAC is a typical example of this kind of 
circuit. A similar situation can also occur when measuring S/H circuits. 

The time domain representation of a sampled-and-held signal is a convolution 
of the sampled signal (3.1) and a square pulse 



$$y(t) = \left[ x(t) \sum_{n=-\infty}^{\infty} \delta(t - nT) \right] * \Pi\!\left(\frac{t}{T} - \frac{1}{2}\right), \tag{3.3}$$

where $\Pi(t/T - 1/2)$ denotes a square pulse from t = 0 to t = T. In the
frequency domain the convolution corresponds to multiplication and thus the
spectrum of the sampled-and-held signal is the spectrum of a sampled signal
multiplied by the spectrum of the square pulse, which has the form of sin(x)/x.
Using this well-known relationship the spectrum of a sampled-and-held signal
can be written as



$$Y(f) = e^{-j\pi fT} \cdot \frac{\sin(\pi f T)}{\pi f T} \sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right). \tag{3.4}$$
Figure 3.6. Spectrum of a sampled-and-held signal.



A power spectrum of this form is shown in Figure 3.6. In many cases the sinc
attenuation is not tolerable and the signal has to be either predistorted before
the hold operation or corrected after it.
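
To give a feeling for the size of this attenuation, the magnitude of the sin(x)/x factor can be evaluated numerically. The following is a minimal sketch; the function name and the 100 MHz sampling rate are assumptions chosen only for illustration, and the hold time is assumed to equal the full sample period T.

```python
import math


def hold_attenuation_db(f_hz: float, t_hold_s: float) -> float:
    """Magnitude of sin(pi*f*T)/(pi*f*T) in decibels for a hold time T."""
    x = math.pi * f_hz * t_hold_s
    if x == 0.0:
        return 0.0                      # limit of sin(x)/x at x = 0 is 1
    ratio = abs(math.sin(x) / x)
    return 20.0 * math.log10(ratio) if ratio > 0.0 else float("-inf")


if __name__ == "__main__":
    f_s = 100e6                         # assumed sampling rate
    for fraction in (0.1, 0.25, 0.5):   # signal frequency as a fraction of f_s
        att = hold_attenuation_db(fraction * f_s, 1.0 / f_s)
        print(f"f = {fraction:.2f} * fs: {att:.2f} dB")
```

At half the sampling frequency the attenuation is close to 3.9 dB, which is why a correction is needed when the held waveform is used as a continuous time signal over the whole Nyquist band.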

2.2 Sampling Function 

When ideal impulses (Dirac’s delta functions) are used to describe operations 
in analog continuous time signal processing, one should be on the alert. It turns 
out that it is impossible to realize a circuit performing the sampling according 
to (3.1). In practice, a circuit cannot pick the instantaneous value of its input 
signal, but rather it takes a weighted average of the input during a time window 
around the sampling moment. Mathematically, this is equal to integrating the 
product of the input signal and the sampling function from minus infinity to 
plus infinity in the time domain. For a single sample this can be written as 
follows: 



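$$y(t_0) = \int_{-\infty}^{\infty} x(t)\, h(t - t_0)\, dt, \tag{3.5}$$

where $t_0$ is the sampling instant and $h(t)$ the sampling function. The same for
an infinite sequence of samples is

$$
\begin{aligned}
y(nT) &= \sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} x(t)\, h(t - nT)\, dt && (3.6) \\
      &= \sum_{n=-\infty}^{\infty} x(nT) * h(-nT) && (3.7) \\
      &= \left[ x(t) * h(-t) \right] \sum_{n=-\infty}^{\infty} \delta(t - nT). && (3.8)
\end{aligned}
$$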



The integral in (3.6) can be identified as a convolution, resulting in (3.7). This
can be interpreted as a sampled form of the convolution integral. In (3.8) the
same is presented using Dirac’s delta function. Utilizing this, the frequency
domain signal can be easily obtained with the Fourier transform:

$$Y(f) = \left[ X(f) \cdot H(-f) \right] * \frac{1}{T} \sum_{n=-\infty}^{\infty} \delta\!\left(f - \frac{n}{T}\right). \tag{3.9}$$

This shows that in the frequency domain the effect of the sampling function is
seen as a multiplication by the conjugate of the Fourier transform of the sampling
function. Since the Fourier transform of an impulse is 1, equation (3.9) is
consistent with (3.2). A more realistic sampling function than Dirac’s delta
function is a triangular pulse, whose Fourier transform is
$\sin^2(\pi f T_b/2)/(\pi f T_b/2)^2$, where $T_b$ is the width of the base of the
triangle. In general it can be said that a
real sampling function always adds a low-pass filtering effect to the sampling
operation. The modeling of the limited tracking bandwidth can also be included
in the sampling function [38].
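
The low-pass effect of a triangular sampling function can be checked numerically. This is only a sketch under stated assumptions: the function name is hypothetical, the triangle is normalized to unit area so that the gain at DC is 1, and the 100 ps base width is an arbitrary example value.

```python
import math


def triangular_sampling_gain(f_hz: float, t_base_s: float) -> float:
    """Gain sin^2(x)/x^2 with x = pi*f*T_b/2 for a unit-area triangular
    sampling function whose base width is T_b."""
    x = math.pi * f_hz * t_base_s / 2.0
    if x == 0.0:
        return 1.0
    return (math.sin(x) / x) ** 2


if __name__ == "__main__":
    t_b = 100e-12                       # assumed base width of 100 ps
    for f in (0.1e9, 1e9, 5e9):
        gain = triangular_sampling_gain(f, t_b)
        print(f"{f / 1e9:.1f} GHz: {20.0 * math.log10(gain):.3f} dB")
```

The gain falls monotonically up to the first null at f = 2/T_b, so a narrower sampling window pushes the filtering effect to higher frequencies.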



3. Noise Issues in S/H Circuits 



3.1 kT/C Noise 

Any sampling circuit can be considered as consisting of at least a switch and 
a capacitor. The switch always has some finite on-resistance which generates 
thermal noise. The power spectral density of this noise is the well-known 
4kTR V²/Hz, where k is Boltzmann’s constant, T the absolute temperature,
and R the resistance. The noise in the voltage sample is the resistor noise 
filtered by the low-pass circuit formed by the sampling capacitor and the switch 
on-resistance. Integrating the resistor noise spectral density weighted by the 
low-pass transfer function yields the mean square noise voltage on the capacitor 




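$$
\begin{aligned}
\overline{v_n^2} &= \int_0^{\infty} \frac{4kTR}{1 + (2\pi f RC)^2}\, df && (3.10) \\
                 &= \frac{4kTR}{2\pi RC} \arctan(2\pi f RC) \Big|_0^{\infty} = \frac{kT}{C}. && (3.11)
\end{aligned}
$$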



By looking at the result it becomes obvious why this noise is often referred to 
as kT/C noise. An interesting point is that the noise voltage does not depend 
on the value of the switch on-resistance, and thus the only parameter which can 
be used to control the noise is the value of the sampling capacitor. Although 
the desired signal bandwidth is typically at least an order of magnitude smaller 
than the noise bandwidth of the sampling circuit, the sampled noise is still 
determined by (3.11). This is due to the fact that the sampling operation aliases
all the noise energy into the Nyquist band. 
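
The independence of the sampled noise from the switch on-resistance can be verified with a few lines of code. The sketch below evaluates the closed-form integral of the filtered resistor noise; the function names, the temperature of 300 K, and the component values are assumptions made for illustration only.

```python
import math

K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def ktc_noise_rms(c_farads: float, temp_k: float = 300.0) -> float:
    """RMS noise voltage sqrt(kT/C) sampled on a capacitor C."""
    return math.sqrt(K_BOLTZMANN * temp_k / c_farads)


def noise_power_up_to(f_max_hz: float, r_ohms: float, c_farads: float,
                      temp_k: float = 300.0) -> float:
    """Integral of 4kTR / (1 + (2*pi*f*R*C)^2) from 0 to f_max, in V^2."""
    scale = 4.0 * K_BOLTZMANN * temp_k * r_ohms / (2.0 * math.pi * r_ohms * c_farads)
    return scale * math.atan(2.0 * math.pi * f_max_hz * r_ohms * c_farads)


if __name__ == "__main__":
    c = 1e-12                                    # assumed 1 pF sampling capacitor
    print(f"rms noise: {ktc_noise_rms(c) * 1e6:.1f} uV")
    for r in (100.0, 1e3, 10e3):                 # three different on-resistances
        total = noise_power_up_to(math.inf, r, c)
        print(f"R = {r:8.0f} ohm: total noise power {total:.3e} V^2")
```

A larger resistance raises the noise density but lowers the bandwidth by the same factor, so the three totals coincide at kT/C.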

In ADCs a common requirement is that thermal noise power is smaller than
the power of the quantization noise, which can be shown to be LSB²/12. This
sets the lowest limit for capacitor value C as follows

$$C > \frac{kT \cdot 12}{\mathrm{LSB}^2} = \frac{kT \cdot 12}{2^{-2N} V_{FS}^2}, \tag{3.12}$$

where N is the number of bits and $V_{FS}$ the voltage corresponding to the ADC full
scale. Sometimes the requirement is more stringent, allowing only 1 dB of SNR 
degradation, which changes the factor 12 in the equation to 46.3. According to 
(3.12), in the case of 1-volt full-scale voltage, the capacitor values required for 
10- and 16-bit resolution are 0.052 pF and 210 pF respectively, which indicates 
that the capacitor values for 16-bit resolution begin to be too large for practical 
integration. To overcome this, a popular solution in high-resolution applications 
is to use an oversampling ADC architecture, in which the capacitor size can be 
reduced linearly with the oversampling ratio. 
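
The capacitor values quoted above can be reproduced directly from (3.12). The sketch below assumes a temperature of 300 K, which the text does not state explicitly; the function name and the `factor` parameter are likewise illustrative.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann's constant in J/K


def min_sampling_capacitor(n_bits: int, v_fs: float,
                           temp_k: float = 300.0, factor: float = 12.0) -> float:
    """Smallest C (in farads) for which kT/C stays below LSB^2 / factor.

    factor = 12 puts thermal noise at the quantization noise level;
    factor = 46.3 corresponds to the stricter 1 dB SNR degradation limit.
    """
    lsb = v_fs / 2 ** n_bits
    return factor * K_BOLTZMANN * temp_k / lsb ** 2


if __name__ == "__main__":
    for bits in (10, 12, 14, 16):
        c_min = min_sampling_capacitor(bits, v_fs=1.0)
        c_strict = min_sampling_capacitor(bits, v_fs=1.0, factor=46.3)
        print(f"{bits:2d} bits: C > {c_min * 1e12:9.3f} pF "
              f"(1 dB limit: {c_strict * 1e12:9.3f} pF)")
```

Each additional bit quadruples the required capacitance, which is why the step from 10 to 16 bits costs a factor of 4096.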



3.2 Jitter in Sampling Clock 

Random variation of the sampling instant is known as jitter. It originates 
from clock generator phase noise and sampling circuit noise. How the jitter is 
transformed to the amplitude error in the sampled voltages can be understood 
as follows: the error in the sampled voltage is equal to the change in the input 
voltage between the ideal sampling instant and the actual sampling instant. The 
voltage change in turn is proportional to the jitter and the rate of change of the 
input signal, i.e. its derivative. For a sinusoidal input the derivative is the cosine
function multiplied by the angular frequency, which means that the voltage error
is proportional to the frequency and the amplitude of the input signal. It can be
shown [39] that the signal-to-noise ratio limited by jitter can be written as

$$\mathrm{SNR} = -20 \log_{10}(2\pi f \Delta t), \tag{3.13}$$

where f is the frequency of the input signal and Δt the rms value of the jitter. It
can be seen that increasing the amplitude of the input signal does not improve
the SNR, since it also increases the voltage error. Jitter is studied further in Chapter 8.
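
Equation (3.13) is easy to evaluate in both directions: for a given jitter it gives the achievable SNR, and for a target SNR it gives the tolerable jitter. The following sketch uses hypothetical function names and example numbers that are not taken from the text.

```python
import math


def jitter_limited_snr_db(f_in_hz: float, jitter_rms_s: float) -> float:
    """SNR in dB limited by rms sampling jitter for a sinusoidal input."""
    return -20.0 * math.log10(2.0 * math.pi * f_in_hz * jitter_rms_s)


def max_jitter_for_snr(f_in_hz: float, snr_db: float) -> float:
    """Largest rms jitter (in seconds) that still allows the given SNR."""
    return 10.0 ** (-snr_db / 20.0) / (2.0 * math.pi * f_in_hz)


if __name__ == "__main__":
    # Assumed example: 100 MHz input sampled with 1 ps rms jitter.
    print(f"{jitter_limited_snr_db(100e6, 1e-12):.1f} dB")
    # Jitter budget for 74 dB (roughly 12 bits) at the same input frequency.
    print(f"{max_jitter_for_snr(100e6, 74.0) * 1e15:.0f} fs")
```

Doubling the input frequency costs 6 dB of SNR for the same jitter, which is the reason sub-sampling at RF places such hard requirements on the sampling clock.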

3.3 Other Noise Sources 

Most S/H circuits need a buffer amplifier or an opamp, at least when in hold 
mode. The internal noise sources of the amplifier add in power to the thermal 
noise of the switch on-resistances. In passive sampling the noise is band limited 
by the RC time constant of the sampling circuit. When an amplifier contributes 
to the circuit transfer function, which is usually the case in hold mode and, in 
some closed-loop S/H architectures, also in sampling mode, its finite bandwidth 
is likely to be the dominant band-limiting factor. To reduce the amount of aliased 
noise the bandwidth of the amplifier should be kept as small as is permitted by 
the settling requirements [40]. This is important, since if the S/H circuit is
followed by an ADC the S/H circuit noise during hold mode is also aliased due
to the sampling performed by the ADC.

In addition to white noise, S/H circuits also suffer from flicker or 1/f noise.
However, in high-speed applications (a clock frequency of several megahertz),
the white noise typically dominates. This is further enhanced by noise aliasing.
If the 1/f noise becomes a problem there are several techniques, such as
correlated double sampling (CDS) or chopper stabilization, to get rid of it [40].

4. Basic S/H Circuit Architectures 

In hold mode an S/H circuit remembers the value of the input signal at 
the sampling moment, and thus it can be considered as an analog memory cell. 
The basic circuit elements that can be employed as a memory are capacitors and 
inductors, of which the capacitors store the signal as a voltage (or charge) and 
the inductors as a current. In addition to the inductor, a current memory needs 
a switch that is a good short circuit when it is closed. Similarly, a switch which 
is a good open circuit in its off-state is needed for a voltage memory. Since 
capacitors and switches with a high off-resistance are far easier to implement in 
a practical integrated circuit (IC) technology than inductors and switches with 
a very small on-resistance, all sample-and-hold circuits are based on voltage 
sampling. There also exist current-mode S/H circuits, but they always include
voltage-to-current and current-to-voltage converters which allow the sampled 
quantity to be voltage. 

S/H circuit architectures can roughly be divided into open-loop and closed- 
loop architectures. The main difference between them is that in closed-loop 
architectures the capacitor, on which the voltage is sampled, is enclosed in a 
feedback loop, at least in hold mode. 

4.1 Open-Loop Architectures 

The simplest S/H circuit consists of a switch and a capacitor (Figure 3.7 (a)). 
In sample mode the switch is closed and the voltage on the capacitor tracks 
the input signal. During the transition to hold mode the switch is opened and 
the input voltage value at the switch opening moment stays on the capacitor. 
This circuit, however, is impractical since it is not capable of driving any load. 
Therefore a buffer has to be used to drive the load. An input buffer may also 
be needed to adjust the signal level to one suitable for the switch and to reduce 
hold mode feedthrough. An S/H circuit with an input and an output buffer is 
shown in Figure 3.7 (b). 

The main advantage of this open-loop S/H architecture is its high speed. Ac- 
curacy, however, is limited by the harmonic distortion arising from the nonlinear 
gain of the buffer amplifiers and the signal-dependent charge injection from the 
switch. These problems are especially emphasized with a MOS technology. 
Figure 3.7. A simple S/H circuit (a) and a practical S/H circuit (b).



4.2 Closed-Loop Architectures

A well-known technique to improve linearity is the utilization of negative
feedback. The feedback can be used internally in the buffer amplifiers in an
open-loop architecture like the one in Figure 3.7 (b). However, this does not
help with switch-induced distortion. The signal-dependent charge injection can
be avoided by operating the switch at a constant potential, which can be realized
by enclosing the switch in a feedback loop to create a virtual ground. Figure 3.8
shows a basic closed-loop S/H circuit following this idea [41, 42].

Figure 3.8. A basic closed-loop S/H circuit.

As a result of feedback the output tracks the input in sample mode. The
switch is connected to the virtual ground provided by the second operational
amplifier and thus it introduces only a constant error charge. When the switch
is opened the global feedback loop is broken and the input voltage is sampled
into the capacitor $C_H$. The capacitor is permanently connected in a feedback
loop around the second operational amplifier, which is used as a buffer both in
track mode and hold mode.

Since the feedback loop encloses two opamps in tracking mode the circuit
has to be heavily compensated in order to avoid instability. This naturally
reduces the speed of the circuit. Another potential disadvantage is hold mode
feedthrough via the parasitic input capacitances of the first operational amplifier.

A closed-loop S/H architecture, commonly used in switched capacitor (SC)
circuits and referred to as flip-around S/H, is shown in Figure 3.9. It performs
the sampling passively, i.e. it is done without the opamp, which makes signal
acquisition fast. In hold mode the sampling capacitor is disconnected from the
input and put in a feedback loop around the opamp, as in the circuit shown in
Figure 3.8. Signal-dependent charge injection from the switches is avoided by
a technique called bottom plate sampling, which relies on special timing of the
switch control signals. This technique is discussed in more detail in Chapter 6.

Figure 3.9. A switched capacitor S/H circuit.



Chapter 4 

A/D CONVERTERS 



1. A/D Conversion 

Analog-to-digital (A/D) conversion can be separated into two distinct op- 
erations: sampling and quantization. Sampling transforms a continuous time 
signal into a corresponding discrete time signal, while quantization converts 
continuous amplitude distribution into a set of discrete levels, which can be 
expressed with digital code words. Figure 4.1 shows the principle of A/D 
conversion. 

Some A/D converter (ADC) architectures, the flash for instance, can perform 
sampling and quantization simultaneously, and in some ADCs, which are tar- 
geted for DC signals, no sampling is needed at all. In high performance ADCs, 
however, sampling and quantization are usually separated to make it possible 
to optimize the circuitry for both tasks without compromises. Furthermore, 
the performance of many ADC architectures, which do not necessarily need a 
separate sampling circuit, can often be improved by adding one. 
Figure 4.1. Principle of A/D conversion.



The sampling operation has already been discussed in Chapter 3 and the 
architectures of S/H circuits will be investigated in Chapter 5. In the remainder 
of this chapter the quantization operation is studied in more detail and the most 
common high-speed ADC architectures are introduced. There is also a bias
toward those architectures most suitable for CMOS technologies. Furthermore,
the focus is on high-speed and medium- to high-resolution (8 bits or more) 
ADCs. Oversampling ADCs are not discussed. 

1.1 Direct Quantization 

The most straightforward way to perform quantization is to compare the 
signal to a reference with a comparator. One comparison yields a one-bit result, 
telling whether the signal is larger or smaller than the reference. Thus, to get
greater accuracy, i.e. a larger number of bits, more comparisons are needed.
For a single signal value, several successive comparisons can be made against
different reference levels (successive approximation architecture), yielding at
most as many bits as there are comparisons. Alternatively, with several parallel
comparators the signal can be compared against many references at once (flash
architecture), which gives a multibit result in one comparison phase. In the case
of two-step flash these two methods are combined, requiring fewer comparators
than a flash ADC and being faster than a successive approximation ADC.
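
The two basic strategies can be contrasted in a small behavioral model. The sketch below assumes ideal comparators and a unipolar input range from 0 to `v_ref`; all names are illustrative and no particular circuit implementation is implied.

```python
def flash_quantize(v_in: float, v_ref: float, n_bits: int) -> int:
    """All 2^N - 1 comparisons are made at once; the code is the number
    of reference levels that the input reaches or exceeds."""
    levels = 2 ** n_bits
    thresholds = [k * v_ref / levels for k in range(1, levels)]
    return sum(1 for th in thresholds if v_in >= th)


def sar_quantize(v_in: float, v_ref: float, n_bits: int) -> int:
    """One comparison per bit, from MSB to LSB (binary search)."""
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)               # tentatively set this bit
        if v_in >= trial * v_ref / 2 ** n_bits: # single comparator decision
            code = trial                        # keep the bit
    return code


if __name__ == "__main__":
    for v in (0.05, 0.30, 0.62, 0.99):
        print(v, flash_quantize(v, 1.0, 3), sar_quantize(v, 1.0, 3))
```

Both functions return the same code for every input; the difference lies in the cost, which is 2^N − 1 comparators in one phase for the flash and one comparator used in N successive phases for successive approximation.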

In these architectures quantization accuracy is, to a large extent, determined 
by comparator accuracy. The larger their number and speed requirement are, the 
more difficult their realization at a reasonable cost in area and power becomes. 
As a result, architectures using analog signal preprocessing prior to quantization 
have been developed. 

1.2 Quantization After Analog Preprocessing 

The preprocessor provides gain, which relaxes the comparator accuracy
requirement. Since the signal range in ADCs is typically comparable to the
supply voltage, a linear amplifier cannot provide significant gain. Thus, the
preprocessor either has a piece-wise linear transfer function (algorithmic and
pipeline ADCs) or produces several shifted nonlinearly amplified signals, out of
which one is selected automatically (flash with distributed preamplification) or
depending on a coarse approximation of the signal (folding-and-interpolating
ADC). The two different types of transfer functions are illustrated in Figure 4.2.
The preprocessing can be realized in a continuous or discrete time domain.

Preprocessing also helps in reducing the number of comparators. This can 
be seen, for instance, from the piece-wise linear transfer function, which folds 
several incoming subranges into one output range, reducing the number of com- 
parators in the final quantization by a factor equal to the number of folds. To 
prevent information loss, a coarse quantization has to be performed prior to 
or in parallel with the preprocessing to distinguish the correct input subrange. 
The accuracy requirement of the coarse quantization can be relaxed by using 
redundancy (overlapping subranges), and thus it does not affect the final quan- 
tization. 
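
How a piece-wise linear preprocessor reduces the comparator count can be sketched with an idealized behavioral model. The model below is an assumption-laden illustration: it uses a sawtooth transfer function, ideal coarse decisions, no redundancy, and an input range of 0 to `v_ref`; the function names are hypothetical.

```python
def preprocess_and_quantize(v_in: float, v_ref: float,
                            coarse_bits: int, fine_bits: int) -> int:
    """Coarse quantization selects the subrange; the piece-wise linear
    preprocessor maps that subrange onto the full range for the fine
    quantizer."""
    n_sub = 2 ** coarse_bits
    sub_width = v_ref / n_sub
    coarse = min(int(v_in / sub_width), n_sub - 1)      # which subrange
    residue = (v_in - coarse * sub_width) * n_sub       # gain of n_sub
    fine_levels = 2 ** fine_bits
    fine = min(int(residue / v_ref * fine_levels), fine_levels - 1)
    return coarse * fine_levels + fine


def comparator_counts(coarse_bits: int, fine_bits: int) -> tuple:
    """Comparators for a direct flash versus coarse plus fine quantizers."""
    direct = 2 ** (coarse_bits + fine_bits) - 1
    split = (2 ** coarse_bits - 1) + (2 ** fine_bits - 1)
    return direct, split


if __name__ == "__main__":
    print(preprocess_and_quantize(0.3, 1.0, coarse_bits=2, fine_bits=3))
    print(comparator_counts(2, 3))
```

In a real converter the coarse decision is not ideal, which is where the overlapping subranges mentioned above come in; they are deliberately left out of this sketch.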



1.3 ADC Figures of Merit 

Some of the parameters used with ADCs are also found in the case of S/H 
circuits and hence have already been presented in the previous chapter. A more 
comprehensive presentation of ADC specifications than the one that follows 
can be found, for example, in [35]. 

The sampling rate tells how many samples the ADC can process in a time 
unit and the latency how many clock cycles there are between the sampling 
instant and the moment when the digital code is available at the ADC output. 

The accuracy of the conversion is specified by its resolution, which is given
in bits. Often, resolution is a synonym for the number of output bits, not all of
which necessarily carry any valuable information. Thus, the real accuracy can
be specified by an effective number of bits (ENOB), which is just the
signal-to-noise-and-distortion ratio (SNDR) expressed in bits.

Ideally, the SNDR is limited by the finite precision of the quantization, which 
leaves an error between the original non-quantized signal and the quantized 
signal. Often, this error is treated statistically and referred to as quantization 
noise as a consequence of its uniform amplitude distribution and virtually flat 
spectral density. Quantization noise limits the SNDR (with full-scale signal) 
to 6.02 · N + 1.76 dB, where N is the resolution. In a practical ADC thermal
noise, disturbing signals, and transfer function errors make the SNDR smaller. 
From this measured SNDR value the effective number of bits can be calculated: 



$$\mathrm{ENOB} = \frac{\mathrm{SNDR} - 1.76}{6.02}. \tag{4.1}$$
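
The relation between SNDR and ENOB is simple enough to be captured in two one-line functions. The names and the example SNDR value are assumptions for illustration; the SNDR is taken in decibels with a full-scale sinusoidal input.

```python
def ideal_sndr_db(n_bits: float) -> float:
    """Quantization-noise-limited SNDR of an ideal N-bit converter."""
    return 6.02 * n_bits + 1.76


def enob(sndr_db: float) -> float:
    """Effective number of bits from a measured SNDR, as in (4.1)."""
    return (sndr_db - 1.76) / 6.02


if __name__ == "__main__":
    print(f"ideal 10-bit SNDR: {ideal_sndr_db(10):.2f} dB")
    # Assumed measurement: a nominally 12-bit ADC showing 62 dB of SNDR.
    print(f"ENOB at 62 dB: {enob(62.0):.2f} bits")
```

In the assumed example the converter delivers 12 output bits, but only about 10 of them carry useful information.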
Figure 4.3. Errors in ADC transfer function.



To gain more information about how the error energy is spectrally distributed, 
spurious free dynamic range (SFDR) and total harmonic distortion (THD) are
often specified. Their definition is the same as with S/H circuits, and unless 
otherwise specified they are obtained with a full-scale input signal. The effective 
resolution bandwidth (ERB) is the input signal frequency, below which the 
ENOB is less than half a bit worse than its DC value. 

The full-scale signal range is expressed in volts. From it, the voltage step 
corresponding to the least significant bit (LSB) can be calculated: LSB =
$V_{FS}/2^N$. The static errors in the transfer function are specified with differ-
ential non-linearity (DNL) and integral non-linearity (INL), both of which are 
referenced to the LSB. They can be either presented graphically as a function of 
the output code, or only the maximum value, which is a single number, can be 
given. The DNL is the error in the step size between two adjacent quantization 
levels, which is ideally 1 LSB. The INL is the cumulative DNL, and it is equal 
to the deviation from a straight line drawn between the end points of the transfer 
function. The DNL and INL are illustrated in Figure 4.3. In addition to these 
the ADC may also contain offset and gain error, which are not shown in the
figure. 
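
The definitions of DNL and INL translate into a short calculation on the measured code transition levels. The sketch below assumes that the transition voltages are already known and uses the end-point straight line, so that offset and gain error are removed; the function name and the example data are hypothetical.

```python
def dnl_inl(transitions):
    """Compute DNL and INL in LSBs from code transition voltages.

    transitions[k] is the input level at which the output changes from
    code k to code k + 1. The LSB is taken from the end points.
    """
    n_steps = len(transitions) - 1
    lsb = (transitions[-1] - transitions[0]) / n_steps
    # DNL: deviation of each step width from the ideal 1 LSB
    dnl = [(transitions[k + 1] - transitions[k]) / lsb - 1.0
           for k in range(n_steps)]
    # INL: running sum of the DNL, zero at both end points
    inl = [0.0]
    for d in dnl:
        inl.append(inl[-1] + d)
    return dnl, inl


if __name__ == "__main__":
    # Assumed example with one transition shifted up by half an LSB.
    measured = [1.0, 2.0, 3.5, 4.0, 5.0]
    dnl, inl = dnl_inl(measured)
    print("DNL:", dnl)
    print("INL:", inl)
    print("max |DNL|:", max(abs(d) for d in dnl))
```

A single displaced transition produces a wide code followed by a narrow one in the DNL, while the INL shows the displacement itself.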

2. Flash ADC 

Flash ADC, which is the fastest and one of the simplest ADC architectures, is
shown in Figure 4.4. It performs $2^N$-level quantization with $2^N - 1$ comparators.
The reference voltages for the comparators are generated using a resistor
ladder, which is connected between the positive (Vref+) and the negative
(Vref−) reference voltage determining the full-scale signal range. Together
the comparator outputs form a ($2^N - 1$)-bit code, where all the bits below the
comparator whose reference is the first to exceed the signal value are ones, while
the bits above are all zeros. This so-called thermometer code is converted to
an N-bit binary word with a logic circuit, which can also contain functions for
removing bit errors (bubbles).

Figure 4.4. N-bit flash ADC.
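
The thermometer code and its conversion to binary can be modeled in a few lines. This is a behavioral sketch only: the ladder is assumed ideal, the function names are illustrative, and the ones-counting encoder is just one of several possible ways to tolerate bubbles, not necessarily the one used in any given design.

```python
def flash_thermometer(v_in: float, v_ref_neg: float, v_ref_pos: float,
                      n_bits: int) -> list:
    """Outputs of the 2^N - 1 comparators, lowest reference first."""
    step = (v_ref_pos - v_ref_neg) / 2 ** n_bits     # ladder tap spacing
    return [1 if v_in > v_ref_neg + (k + 1) * step else 0
            for k in range(2 ** n_bits - 1)]


def thermometer_to_binary(thermo: list) -> int:
    """Encode by counting ones, so an isolated bubble such as
    1, 1, 0, 1, 0, ... still yields a code close to the correct one."""
    return sum(thermo)


if __name__ == "__main__":
    thermo = flash_thermometer(0.4, 0.0, 1.0, n_bits=3)
    print(thermo, format(thermometer_to_binary(thermo), "03b"))
    bubbled = [1, 1, 0, 1, 0, 0, 0]                  # assumed bit error
    print(bubbled, format(thermometer_to_binary(bubbled), "03b"))
```

A simple encoder that only looks for the 1-to-0 transition would find two such transitions in the bubbled code, which is why some form of error suppression is usually built into the logic.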

Since the input signal is directly connected to the inputs of the comparators, 
flash architecture is very fast; the speed is only limited by the comparators. 
Thus, the fastest reported ADCs are realized with this architecture. Flash ADC 
also has very low latency (typically one to two clock cycles), which allows it
to be utilized in applications using feedback (e.g. gain control loop). 

The most prominent drawback of flash ADC is the fact that the number 
of comparators grows exponentially with the number of bits. Increasing the 
quantity of the comparators also increases the area of the circuit, as well as the 
power consumption. Thus, very high resolution flash ADCs are not practical; 
typical resolutions are seven bits or below. 

Other issues limiting the resolution and speed include nonlinear input
capacitance, location-dependent reference node time constants, incoherent timing
of comparators laid out over a large area, and comparator offsets. To manage
the offsets the utilization of auto-zeroing comparators is often necessary.
Alternatively, the offsets and input capacitance can be reduced by means of
distributed preamplification combined with averaging [43, 44], and possibly
also with interpolation [45], which can both be considered as examples of the
signal preprocessing discussed in the previous section.

In averaging each comparator is preceded by a preamplifier, whose output 
is coupled to the outputs of the adjacent preamplifiers via a resistive averaging 
network. As a result, the input signal for a comparator is not produced by 
its own preamplifier alone, but it is a weighted average of the outputs of the 
preamplifiers in a small neighborhood. Comparator offset is reduced by the 
preamplifier gain and the preamplifier offset is an average of the random offsets 
of all the amplifiers participating in the amplification. 

Not every comparator needs to have a preamplifier of its own; instead, some 
(typically every other or three out of four) amplifiers can be eliminated and 
the missing signals generated by means of interpolation. Neither averaging
nor interpolation reduces the number of comparators, and thus neither
significantly extends flash architecture toward higher resolutions.

Recently, the main application of flash ADCs has been in disk drive read 
channel circuits and local area network interfaces. Typically, six-bit resolution 
with a sampling rate of several hundred megahertz is required. Even gigahertz 
rates seem to be within the reach of state-of-the-art CMOS technologies [46, 47].



3. Subranging ADC 

One way to reduce the number of comparators in flash ADC is to perform 
the quantization in two phases [48]. First a coarse quantization determines the 
subrange where the signal lies and then, in the second phase, the quantization 
is performed inside this range. Consequently, the number of comparators can
be reduced from $2^N - 1$ to $2 \cdot (2^{N/2} - 1)$, assuming that the number
of bits determined in each phase is the same. In an 8-bit ADC this means 30
comparators instead of 255. This type of architecture is known as either
two-step flash or subranging architecture.
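
The comparator counts quoted above are easy to tabulate. The following is a
minimal sketch; the function names are hypothetical, and the two-step count
assumes an even resolution split equally between the coarse and fine phases.

```python
def flash_comparators(n_bits):
    """Comparators in a full flash ADC: 2**N - 1."""
    return 2 ** n_bits - 1


def two_step_comparators(n_bits):
    """Comparators in a two-step flash with N/2 bits per phase.

    Assumes n_bits is even, so that both phases resolve the same number
    of bits: 2 * (2**(N/2) - 1).
    """
    bits_per_phase = n_bits // 2
    return 2 * (2 ** bits_per_phase - 1)


for n in (6, 8, 10, 12):
    print(n, flash_comparators(n), two_step_comparators(n))
```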

A block diagram of a subranging ADC is shown in Figure 4.5. There, the 
input signal is first sampled with a sample-and-hold circuit, which guarantees 
that both of the flashes have the same input signal. The first flash resolves the 
most significant bits (MSBs), which are also utilized for determining the coarse 
input range, according to which the reference voltages for the second flash are 
selected. A conceptual implementation of this selection, employing a common
resistor ladder for both stages, is shown in Figure 4.6. For example, when the
coarse A/D conversion tells us that the signal is between the first stage
reference voltages $V_{R1,2}$ and $V_{R1,3}$, the taps in range 3 are used as
fine references.



Figure 4.5. Subranging ADC.

Figure 4.6. Subrange selection.



Subranging architecture allows for higher resolutions than the flash method, 
thanks to the reduced number of comparators. It cannot achieve as high a 
speed because of the S/H circuit that is needed and the fact that the fine ADC 
can operate only after the coarse result is available. The architecture does not 
alleviate the comparator accuracy requirement, unless inter-stage gain is intro- 
duced. Then, however, it should rather be referred to as a two-stage pipeline. 
Adding redundant comparators to the fine ADC, connected to the taps at the
edges of the neighboring subranges, can be used to relax the comparator
accuracy requirements in the coarse flash.

Figure 4.7. Folding principle and a block diagram of a folding ADC.

The applications of subranging architecture are most often found in video 
signal acquisition, where 10-bit resolution with a 20-40 MS/s sampling rate is 
typically needed. With CMOS technology a sampling rate as high as 100 MS/s 
with 8-bit resolution has been reported in [49], while 12-bit resolution at 54 
MS/s has been achieved in [50]. 

4. Folding-and-interpolating ADC 

4.1 Folding 

Figure 4.7 shows the idea of folding. Here the input signal range is divided
into eight subranges (A to H), at the boundaries of which the signal is folded 
up or down. As a result the output signal range is squeezed into one-eighth, 
reducing the required number of comparators by the same amount. In addition, 
folding allows signal amplification back to the full scale (not shown in the 
figure), which helps in reducing the effect of comparator offsets and noise. 

The concept of folding is very similar to subranging, the major difference 
being that folding does not need a priori knowledge of the input subrange. As 
a result, no S/H circuit is needed, which leads to faster operation. Although the 
subrange need not be known at the time of folding, it is needed in the forming 
of the final conversion result. Thus, a coarse ADC is used in parallel with the 
folding circuit, as shown in the principal block diagram in Figure 4.7. 
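
The interplay between the folded signal and the coarse subrange information
can be sketched with an ideal triangular folding function. This is only a
behavioral illustration under stated assumptions: the input range is taken as
0 to 1, the folding is perfectly piecewise linear, and the function names are
hypothetical.

```python
def fold(v_in, n_folds=8, full_scale=1.0):
    """Ideal triangular folding of an input in [0, full_scale).

    Returns (subrange_index, folded_value). The folded value lies in
    [0, full_scale / n_folds]; odd subranges are mirrored (folded down),
    so the folded signal is continuous across subrange boundaries.
    """
    width = full_scale / n_folds
    index = min(int(v_in / width), n_folds - 1)  # the coarse ADC's job
    offset = v_in - index * width
    folded = offset if index % 2 == 0 else width - offset
    return index, folded


def unfold(index, folded, n_folds=8, full_scale=1.0):
    """Combine the coarse subrange with the folded value to recover the input."""
    width = full_scale / n_folds
    offset = folded if index % 2 == 0 else width - folded
    return index * width + offset


index, folded = fold(0.30)
print(index, folded, unfold(index, folded))
```

The fine quantizer only ever sees the folded value, which spans one-eighth of
the full scale, while the coarse index resolves which of the eight subranges
the sample came from.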

In practice, realizing a transfer function with the triangle wave shape is very
difficult, since especially the sharp corners tend to become smoothed. This
problem can be solved by producing several versions of the folded signal, each
shifted a different amount in the x-direction, and using only the linear part of
each curve. This is illustrated in Figure 4.8, where five nonlinear curves are
used instead of one linear one. The little marks around the zero crossings show
which portion of each curve is utilized. All the comparators responsible for
detecting the signal in this range are connected to the circuit producing that
particular curve. Often, the number of curves is increased up to the point where
they equal the number of comparators. As a result, there is only one comparator
per curve and it only has to detect the signal zero crossing, making the linearity
of the curve unimportant.

4.2 Interpolation 

The folder is a rather complex circuit, and thus the number of them should 
not be too large. Fortunately, it can be reduced by generating part of the sig- 
nals by interpolating between two signals produced by adjacent folders. In 
Figure 4.8 two of the signals (dashed lines) are generated with interpolation. 
The interpolation happens in the y-direction, as a result of which the resulting 
curve is not a perfect substitute for a real folded signal. In fact, only one zero 
crossing can be interpolated between two curves without an error. In practice, 
however, a larger number of interpolated signals can be used, when the folded 
signals are sufficiently linear and the shifting of the zero crossings is taken into 
account in the interpolation network. 

4.3 ADC Architecture 

A block diagram of folding-and-interpolating ADC is shown in Figure 4.9. 
Here four folders are followed by an interpolation network, to which the
comparators are connected. A coarse flash ADC determines the correct input 
subrange, which is used with the comparator outputs to form the final output 
code in the logic block. Different internal signal propagation delays in the 
coarse ADC and in the folding and interpolating circuitry need to be taken 
into account to guarantee simultaneous timing in the two circuits. Adding 
redundant folded or interpolated signals increases tolerance to timing errors 
and comparator inaccuracy in the coarse ADC. 

A CMOS implementation of the folding amplifier is shown in Figure 4.10
[51]. It produces a four-times-folded differential current signal with five
differential pairs. The interpolation between two folders can be realized with
a resistor ladder.

The folding-and-interpolating architecture can achieve high sampling rates.
This is mostly thanks to open loop circuitry (folding amplifiers do not use
feedback) and continuous time operation. The circuit has a relatively small
silicon footprint and it can typically be realized with a digital CMOS process,
i.e. without precision capacitors.

Figure 4.9. Folding-and-interpolating ADC.

Figure 4.10. CMOS folding amplifier with differential current output.

4.4 Limitations and Improvements

A well-known problem in this architecture is the fact that because of the
folding the frequencies of the internal signals are much higher than the frequency
of the incoming signal. As a result the performance typically begins to degrade
at relatively low signal frequencies. The problem can be alleviated by using
an S/H circuit in front of the converter, which, however, often diminishes the
speed advantage. In distributed track-and-hold [52] each differential pair in the
folding amplifier has its own preamplifier with track-and-hold, which has less
stringent specifications than a front-end S/H circuit would have. As a result, a
higher speed can be achieved.

The folding-and-interpolating architecture was originally developed for
bipolar technology, which is ideal for realizing accurate open-loop circuits
thanks to good Vbe matching and high transconductance of the bipolar transistor.
On the other hand, the offset voltages in MOS transistors are the main obstacle
to increasing resolution. Thus, techniques such as averaging [44, 53] and self
calibration [54] are used to reduce offset sensitivity.

The folding can be carried out in many cascaded stages, minimizing the
number of folds per stage [44, 53]. Consequently, the number of differential
pairs connected to the input is reduced, which allows for the biasing of the
transistors to a larger gate-source voltage, which increases the speed via
increased transconductance. The capacitive load is also reduced, giving an
additional increase in circuit speed.

The throughput of the folding-and-interpolating architecture can be
improved, at the cost of latency, by using pipelining, which can be realized by
combining cascaded folding with the distributed T/H as demonstrated in [55]
and [54]. Both of these designs also use subranging to reduce the number of
folders.

The resolution of the folding-and-interpolating ADCs reported has typically
been in the 8-10-bit region and the sampling rates from some dozens of
megahertz to a hundred megahertz. As high as a 400-MS/s sampling rate has been
achieved [56], though with only 6-bit resolution. The resolution of CMOS
realizations has been limited to 10 bits, with one exception [54], which uses
background self-calibration to cancel the offsets of the folding amplifiers.

The folding amplifier structure based on differential pairs does not allow
for very low-voltage operation, since it requires at least $V_T + 2 V_{dsat}$ on top
of the input signal swing. Consequently, many of the ADCs described in the
references use a 5-volt supply and not a single one goes below three volts.



5. Pipelined ADC 

Like folding-and-interpolating ADC, pipelined ADC also uses analog pre- 
processing to divide the input range into subintervals and to amplify the signal 
inside them, as seen from the transfer function shown in Figure 4.11. The re- 
alization of the preprocessing is, however, totally different. The architecture 
has evolved by making use of the strengths of the switched capacitor technique, 
which provides very accurate and linear analog amplification and summation 
operations in the discrete time domain. As a result the sawtooth-shaped transfer 
function can readily be realized. Because of the high quality of the preprocess- 
ing it is often used so extensively that the resolution of final A/D conversion is 
only one or two bits. 

A predecessor of pipelined ADC is algorithmic (or cyclic) ADC, which
performs the conversion (or processing) in several time steps. It uses a single
processing unit, which works with the analog sample for a certain number, say
m, of clock cycles. In pipelined ADC, m of those units (usually called
pipeline stages) are cascaded. Each stage processes the same sample only for
one clock cycle, after which it passes it to the next stage for further processing.
The end result and the latency are the same as in algorithmic ADC, but the
throughput has increased m times, being now one conversion per clock cycle.

Figure 4.11. Residue transfer function of pipelined ADC.

5.1 Pipelined A/D Conversion: Principle 

The principle in pipelined (and algorithmic) A/D conversion is to find a set 
of reference voltages whose sum equals the signal sample being converted. 
This is realized by sequentially subtracting different reference voltages from 
the sample until the residue becomes zero, indicating that the sum of the sub- 
tracted references equals the original sample value. An analogy can be found 
in weighing flour on a pair of scales using a set of weights. In ADC the residue 
is amplified between the subtraction steps in order to increase accuracy. 

5.1.1 Operation of a Pipeline Stage 

For each stage, the input signal is the output of the previous stage
($V_{OUT,i-1}$), except for the first stage, for which it is the input voltage
$V_{in}$. The stage evaluates the incoming signal with a coarse A/D conversion,
which yields the digital output code $Q_i$. Now, as the magnitude of the input
signal is roughly known, the closest multiple of the reference voltage
($Q_i \cdot V_{R,i}$) can be subtracted from it and the resulting residue signal
amplified by the stage gain $G_i$, yielding the output voltage

$$V_{OUT,i} = G_i \cdot V_{OUT,i-1} - Q_i \cdot V_{R,i}, \tag{4.2}$$

which is used as an input for the next stage. It should be noted here that the
coarse A/D conversion needs only to be accurate enough to prevent residue
voltage saturation after amplification. If this is satisfied, its accuracy does not
affect the converter's final accuracy, unlike the precision of the amplification
and the reference subtraction. Tolerance toward sub-ADC errors requires that
the input range of the next stage and the set of its reference voltages have at
least as much overhead as the largest allowed error.

Figure 4.12. Pipelined ADC.
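
The stage recursion can be tried out numerically. The sketch below is a
simplified behavioral model, not a circuit description: it assumes an
idealized sub-ADC that always picks the integer code bringing the residue
closest to zero, and the function names are hypothetical. It then refers the
subtracted references back to the input by dividing each one by the gain
accumulated up to and including its own stage, which is what unrolling (4.2)
gives.

```python
def convert(v_in, gains, v_refs):
    """Run the stage recursion V_i = G_i * V_(i-1) - Q_i * V_R,i.

    Q_i is chosen as the integer that brings the residue closest to zero
    (an idealized sub-ADC; real stages restrict Q_i to a few codes).
    Returns the list of codes and the final residue.
    """
    codes, v = [], v_in
    for g, v_r in zip(gains, v_refs):
        q = round(g * v / v_r)
        v = g * v - q * v_r
        codes.append(q)
    return codes, v


def input_referred(codes, gains, v_refs):
    """Sum the subtracted references, each divided by the gain seen so far."""
    total, cumulative_gain = 0.0, 1.0
    for q, g, v_r in zip(codes, gains, v_refs):
        cumulative_gain *= g
        total += q * v_r / cumulative_gain
    return total, cumulative_gain


gains = [2.0] * 10
v_refs = [1.0] * 10
codes, residue = convert(0.3137, gains, v_refs)
estimate, total_gain = input_referred(codes, gains, v_refs)
print(codes)
print(estimate, abs(0.3137 - estimate), abs(residue) / total_gain)
```

The difference between the sample and the estimate equals the last residue
divided by the total gain, so every additional stage shrinks the remaining
error by that stage's gain.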

5.1.2 Forming the Output Code 

In principle, the conversion can be continued for an infinite number of cycles, 
each of which increases the precision of the result. In reality, the result does not 
improve after some point as a consequence of component mismatch and noise. 
Once the desired number of conversion cycles has been completed, the final
conversion result is the sum of the subtracted reference voltages, referred to the
input of the ADC:

$$V_{OUT} = \frac{Q_1 \cdot V_{R,1}}{G_1} + \sum_{i=2}^{m} \left( \frac{Q_i \cdot V_{R,i}}{G_i} \prod_{j=1}^{i-1} \frac{1}{G_j} \right) \tag{4.3}$$

where $Q_i$ is the i-th stage digital output as a bit vector and $V_{R,i}$ an equal
length vector containing the stage's reference voltages. The digital code corresponding
to (4.3) is obtained by replacing the reference voltages with their digital values.
Typically, the final resolution is increased by performing an A/D conversion
(without forming a residue) for the last residue signal. This is the only sub-A/D
conversion whose accuracy directly affects the accuracy of the conversion.

5.2 Pipeline Architecture

The block diagram of the pipelined ADC is shown in Figure 4.12. It consists 
of m stages, each of which produces k bits plus one redundant bit, which 
overlaps with the bits of the next stage. The last pipeline stage is followed by 
a flash ADC, providing j bits. As a result, the final resolution N is m • k + j. 
In practice, the resolution of all the stages does not need to be equal and the
redundant bit does not need to be a full bit (i.e. the number of possible output
codes can be less than $2^{k+1}$).

A functional block diagram of one stage is shown in the inset. The incoming
voltage is sampled by the S/H circuit and simultaneously digitized by the
sub-ADC. The result of the A/D conversion is immediately converted back to analog
form and subtracted from the sampled-and-held signal. The resulting residue
voltage is amplified by $G_i$, which is nominally equal to $2^k$. In a switched
capacitor realization the S/H operation, the D/A conversion, the subtraction, and
the amplification are all performed by a single circuit block, called a multiplying
digital-to-analog converter (MDAC), which consists of an opamp and a set of
switched capacitors. The low resolution sub-ADC is usually a flash, consisting
of a few comparators and logic gates.

A front-end S/H circuit is not necessarily needed, since the pipeline stage 
already contains an S/H circuit. When the input is a rapidly-changing signal,
the relative timing of the first stage S/H circuit and the sub-ADC is critical and
is often relaxed with a front-end S/H circuit.

Consecutive stages operate in opposite clock phases and as a result one 
sample traverses two stages in one clock cycle. So, the latency in clock cycles 
is typically half the number of stages plus one, which is required for digital 
error correction. For feedback purposes, where low latency is essential, a 
coarse result can be taken after the first couple of stages. The different bits of 
a sample become ready at different times. Thus, digital delay lines are needed 
for aligning the bits. 

The first monolithic pipelined ADC [57] was implemented with the switched 
capacitor technique, which has since become the standard in pipelined ADCs. 
Current mode approaches have also been tried [58, 59], but they have not been 
able to achieve the same performance. 

5.3 RSD Correction 

The simplest pipeline stage is a 1-bit stage with one redundant quantization
level, as a consequence of which it is often referred to as a 1.5-bit stage. Its
transfer function is shown in Figure 4.13. The signal range both in the input and
in the output is from $-V_{ref}$ to $+V_{ref}$. Nominally, the comparator decision
levels are set to $-V_{ref}/4$ and $+V_{ref}/4$ and the ADC output codes for the
three regions are “00”, “01” and “10”.




Figure 4.13. Transfer function of a 1.5-bit pipeline stage.

Figure 4.14. RSD correction in digital domain.



Comparator offset can shift the decision level, as shown in the figure with 
the dashed line. Consequently, the ADC output code remains “00” instead of 
“01”. The residue, however, stays in the input range of the next stage, which 
is all that matters, as discussed earlier. The final conversion result is the sum 
of all the subtracted reference voltages. Thus, if one stage subtracts a smaller 
reference than it nominally should, the subsequent stages compensate for this 
by subtracting larger references. How this works in the case illustrated in the 
figure is shown next. 

Due to quantization error the residue now lies in the “10” region (instead of 
the “00” or “01” region) of the next stage. Since the outputs of the stages are 
summed with one-bit overlap, the second most significant bit will be correct, 
regardless of the comparison error. When the codes of the following stages are 
taken into account, the third bit also gets corrected, and so on. 

In the digital domain the correction is a simple addition, as illustrated in 
Figure 4.14. On the left is shown how the bits of the final result are obtained 
by summing the output bits of the stages with one-bit overlap. On the right the 
same bits are rearranged to show how the correction can be performed with a 
single adder. Since “11” codes are not possible (except in the last flash stage),
the sum cannot overflow.

Figure 4.15. Switched capacitor MDAC for a 1.5-bit pipeline stage.



This widely-employed error correction method is referred to as redundant 
signed digit (RSD) correction, and was developed for algorithmic ADCs in [60]
and [61] and utilized in pipelined ADC in [62]. Other related methods have 
also been used (e.g. [57]). 

The redundancy allows for quantization errors as long as the residue stays in
the input range of the next stage, which translates to a maximum error of
$\pm V_{ref}/4$ in the 1.5 bits/stage architecture. The errors can be static or
dynamic; it is only essential that the bits going to the correction logic
circuitry match those which are D/A converted and used in residue formation.
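
The tolerance to comparator offsets can be checked with a short behavioral
simulation. The sketch below is an idealized model under stated assumptions:
exact gain-of-two residue amplification, a 2-bit flash after the last stage,
and hypothetical function names. Each stage code is added with one-bit overlap
simply by shifting the running sum left by one before adding the next code.

```python
def stage_1p5(v, v_ref, offset_low=0.0, offset_high=0.0):
    """One 1.5-bit stage. Returns (code, residue); code 0, 1, 2 = "00", "01", "10".

    offset_low / offset_high model comparator offsets that shift the
    nominal decision levels -v_ref/4 and +v_ref/4.
    """
    if v < -v_ref / 4 + offset_low:
        code = 0
    elif v < v_ref / 4 + offset_high:
        code = 1
    else:
        code = 2
    q = code - 1  # signed digit -1, 0 or +1
    return code, 2 * v - q * v_ref


def flash_2bit(v, v_ref):
    """Final 2-bit flash with decision levels -v_ref/2, 0 and +v_ref/2."""
    return sum(1 for level in (-v_ref / 2, 0.0, v_ref / 2) if v >= level)


def pipeline_rsd(v_in, offsets, v_ref=1.0):
    """Convert v_in with len(offsets) stages plus a 2-bit flash.

    The stage codes are summed with one-bit overlap, giving a result of
    len(offsets) + 2 bits.
    """
    v, word = v_in, 0
    for offset_low, offset_high in offsets:
        code, v = stage_1p5(v, v_ref, offset_low, offset_high)
        word = (word << 1) + code
    return (word << 1) + flash_2bit(v, v_ref)


ideal = [(0.0, 0.0)] * 8
shifted = [(0.2, -0.2), (-0.2, 0.2)] * 4  # offsets just inside +/- v_ref/4
for v in (-0.73, 0.011, 0.4):
    print(v, pipeline_rsd(v, ideal), pipeline_rsd(v, shifted))
```

With eight stages the model produces a 10-bit result, and the same output is
obtained with and without the offsets as long as they stay within a quarter
of the reference voltage.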

The same correction method can easily be expanded to larger resolution
stages as well. As a minimum, one extra quantization level is required [63], but
for maximum error tolerance the nominal number of comparators ($2^k - 1$) has
to be doubled.

5.4 Switched Capacitor Realization 

Switched capacitor implementation of an MDAC suitable for a 1.5-bit pipeline
stage is shown in Figure 4.15. During the first clock phase it samples the input
voltage into two nominally equally-sized capacitors $C_1$ and $C_2$. In the second
phase the capacitor $C_2$ is connected in a feedback loop around the amplifier and
the capacitor $C_1$ to one of the three reference voltages, according to the output
of the stage's sub-ADC. The resulting output voltage is given by:

$$V_{out} = \frac{C_1 + C_2}{C_2} \cdot V_{in} + Q \cdot \frac{C_1}{C_2} \cdot V_{ref} = 2 V_{in} + Q \cdot V_{ref}, \tag{4.4}$$

where Q is the ADC output code with possible values -1, 0, and +1. The
second form of the equation, which is consistent with (4.2), is obtained with
equal capacitor values.

If there is mismatch in the capacitor values, $C_1$ being $C + \Delta C$ and $C_2$
$C - \Delta C$, the gain will be approximately $2 + 2 \Delta C / C$ instead of 2 and
the reference voltage approximately $(1 + 2 \Delta C / C) V_{ref}$ instead of
$V_{ref}$. Similarly, a finite opamp gain will reduce the gain to
$2 \cdot (1 - 2/A - C_{in}/(C A))$ and the reference voltage to
$(1 - 2/A - C_{in}/(C A)) V_{ref}$, where A is the opamp gain and $C_{in}$ the
capacitance from the negative opamp input node to the ground. Depending on the
opamp topology, $C_{in}$ may be much larger than the physical capacitor because
of the Miller effect.
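
The sensitivity of the stage to these nonidealities can be explored with a
charge-conservation model of the MDAC. The following is a sketch under stated
assumptions rather than a definitive model: the function name and parameters
are hypothetical, the sign convention for Q follows (4.4), the mismatch is
taken to be symmetric ($C_1 = C + \Delta C$, $C_2 = C - \Delta C$), and the
opamp is described only by a finite DC gain and an input capacitance.

```python
def mdac_output(v_in, q, v_ref, c1, c2, opamp_gain=float("inf"), c_in=0.0):
    """Output of a 1.5-bit MDAC from charge conservation at the opamp input.

    q is the sub-ADC decision (-1, 0 or +1). With an ideal opamp and equal
    capacitors the result reduces to 2 * v_in + q * v_ref.
    """
    numerator = (c1 + c2) * v_in + q * c1 * v_ref
    denominator = c2 + (c1 + c2 + c_in) / opamp_gain
    return numerator / denominator


# Ideal stage.
print(mdac_output(0.3, -1, 1.0, 1.0, 1.0))

# Capacitor mismatch: compare with the first-order estimates.
d = 0.001
print(mdac_output(1.0, 0, 1.0, 1 + d, 1 - d), 2 + 2 * d)   # signal gain
print(mdac_output(0.0, 1, 1.0, 1 + d, 1 - d), 1 + 2 * d)   # reference weight

# Finite opamp gain with input capacitance C_in = 0.5 * C.
a, c_in = 1000.0, 0.5
print(mdac_output(1.0, 0, 1.0, 1.0, 1.0, a, c_in), 2 * (1 - 2 / a - c_in / a))
```

The exact and first-order values differ only in terms of second order in the
mismatch or in 1/A, which is why the linearized expressions are normally
sufficient for setting specifications.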

The effect of finite opamp bandwidth can be analyzed by replacing A with
$A/(1 + s/\omega_p)$, which assumes a single pole opamp model with pole
frequency $\omega_p$. The bandwidth requirement is discussed in more detail
later in this section.

The voltage Vref is a global reference, common for all ADC stages. Thus, 
any deviation from the nominal value will be seen as absolute gain error, which 
is usually not harmful. In general, a DAC with two levels (0 and Vref for 
instance) is always linear. In a fully differential circuit a third level can be added
by defining the reference as the difference between two voltages ($V_{ref+}$ and
$V_{ref-}$), to which the capacitors in the positive and negative half circuit are
connected. Two levels are obtained by alternating the polarity of the connection
and the third (the zero) by shorting the capacitors together.

In higher resolution pipeline stages, more than three DAC levels are required.
They can be obtained by generating more reference voltages with a resistor
ladder, but then the circuit is no longer insensitive to the values of the
reference voltages. A better and more commonly used solution is to split the
capacitor $C_1$ into several pieces, which can be independently connected to the
references. The advantage stems from the fact that capacitor matching is
typically better than resistor matching. A generic MDAC, thus, consists of a
capacitor $C_2$, equal to C, and a capacitor $C_1$, which is constructed of G - 1
pieces, each equal to C.

Several techniques for achieving resolutions higher than what is permitted
by matching have been developed, of which self-calibration techniques are
covered later in this section. The reference feedforward technique [64] and
commutated feedback capacitor switching (CFCS) [65, 66] improve the DNL,
but do not affect the INL. In 1-bit/stage architecture the capacitive error
averaging technique, which has previously been used in algorithmic ADCs [67], can
be used [68, 69]. With it, a virtually capacitor ratio-independent gain-of-two
stage can be realized. The technique, however, requires two opamps per stage
(a modification, which does not, has been proposed in [70]) and needs at least
one extra clock phase.

5.5 Scaling 

Some of the error sources in different stages, opamp gain for instance, are
clearly correlated. Thus, when referred to the ADC input, they add up linearly,
resulting in

$$V_{\varepsilon,tot} = V_{\varepsilon,1} + \frac{1}{G_1} V_{\varepsilon,2} + \frac{1}{G_1 G_2} V_{\varepsilon,3} + \cdots \tag{4.5}$$

The first stage error dominates, but when the stage gain is low, errors in the
latter stages also make a significant contribution to the total error. For example,
in 1.5 bits/stage architecture the total error is about $2 V_{\varepsilon}$, when the
errors in the stages are equal ($V_{\varepsilon}$). In a stage with two effective
bits (gain 4) the factor 2 reduces to 1.3.

Thermal noise and random capacitor mismatch in different stages can be
considered uncorrelated. Consequently, their total effect can be found by summing
the squares of the input referred voltages, yielding

$$V_{\varepsilon,tot} = \sqrt{V_{\varepsilon,1}^2 + \frac{1}{G_1^2} V_{\varepsilon,2}^2 + \frac{1}{G_1^2 G_2^2} V_{\varepsilon,3}^2 + \cdots} \tag{4.6}$$

Now the total noise in the 1.5 bits/stage architecture is only about
$1.15 V_{\varepsilon}$ and is practically totally determined by the first stage if
stages with higher gain are used.
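
The factors quoted for correlated and uncorrelated errors follow from simple
geometric sums. The sketch below evaluates them for equal per-stage errors and
equal stage gains; the function names and the ten-stage pipeline length are
assumptions chosen only for illustration.

```python
import math


def correlated_sum(gain, n_stages):
    """Input-referred sum of equal, fully correlated stage errors.

    Stage i (counting from zero) is attenuated by gain**i when referred
    to the ADC input, and the contributions add linearly.
    """
    return sum(gain ** -i for i in range(n_stages))


def uncorrelated_sum(gain, n_stages):
    """Root-sum-square of equal, uncorrelated stage errors."""
    return math.sqrt(sum(gain ** (-2 * i) for i in range(n_stages)))


for gain in (2, 4):
    print(gain,
          round(correlated_sum(gain, 10), 3),
          round(uncorrelated_sum(gain, 10), 3))
```

For a gain of two the linear sum approaches 2 and the root-sum-square
approaches the square root of 4/3, i.e. about 1.15; for a gain of four the
linear sum approaches 4/3.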

The stages need not be identical. In fact, considerable savings in power 
and silicon area can be achieved when the capacitors in the latter stages are 
scaled down [71], which relaxes opamp specifications and reduces the circuit 
area. The scaling requires budgeting a larger amount of the total noise (and 
capacitor mismatch) to the latter stages, which can, however, easily be done 
without making the first stage specifications much more stringent. 

In principle, the noise contribution of every stage can be made equal, which
is obtained with the scaling factor $G^2$. It would, however, increase the first
stage specifications too much. Thus, a more reasonable scaling factor would
be, for instance, G, except in the 1.5 bits/stage architecture, where even a
smaller scaling factor may be optimal. Analysis in [72] shows that, from a
power consumption point of view, the optimal scaling factor is between G and
$G^{1.5}$; the larger the stage resolution, the larger the optimal scaling factor.

It is possible also to relax opamp gain and bandwidth in the latter stages, 
but not as much as capacitor size. Designing every stage with individual spec- 
ifications is often too laborious. Instead, using two or three differently-sized 
stages is a good compromise. Scaling brings the largest advantages to the very 
first stages, whose requirements are the toughest. Thus, the efforts should be 
focused there. 



5.6 Per-Stage Resolution 

What is the best stage resolution? The smaller the stage resolution is, the
larger the total number of stages becomes. On the other hand, each additional
bit in a stage doubles the number of its comparators and halves the tolerable
comparator offset. Furthermore, the sub-ADC input capacitance is proportional
to the number of comparators. Thus, the exponential growth in the number of
comparators sets an upper limit for the practical stage resolution, which seems
to be around five bits.



5.6.1 Opamp DC Gain 

When it comes to circuit speed and power consumption, the opamp is typi- 
cally more important than the comparators. Thus, the effect of stage resolution 
on it is investigated next. The amount of sampled thermal noise is determined 
by the size of the sampling capacitor. To keep the noise the same regardless of 
the stage resolution, let the total sampling capacitance ($C_{tot}$) be fixed. Thus,
$C_{tot} = G \cdot C$, which leads to a decreasing unit capacitor C for increasing
stage resolution.

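The output voltage of an MDAC in the presence of finite opamp gain can be
written as

$$V_{out} = \left( G \cdot V_{in} + Q \cdot V_{ref} \right) \left( 1 - \frac{G}{A} - \frac{G \cdot C_{in}}{C_{tot} \cdot A} \right). \tag{4.7}$$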



The maximum error occurs when the output voltage should be at its nominal
maximum ($\pm V_{ref}$), which is obtained, for example, when $Q = -G + 1$ and
$V_{in} = V_{ref}$. Then the corresponding input referred error voltage is

$$V_{\varepsilon} = \left( \frac{1}{A} + \frac{C_{in}}{C_{tot} \cdot A} \right) V_{ref}, \tag{4.8}$$

which can be seen to be independent of the stage resolution [74].

Using (4.8), the required opamp DC gain in an N-bit pipeline ADC can be
derived. The full-scale input range is $2 V_{ref}$, making the LSB step equal to
$V_{ref}/2^{N-1}$. A common requirement is that the error is smaller than the
maximum quantization error, i.e. 0.5 LSB. When neglecting the input capacitance,
the requirement becomes $A > 2^N$. In decibels the same is given by

$$20 \log(A) > 6.02 \cdot N \ \mathrm{dB}. \tag{4.9}$$

In practice, the finite input capacitance slightly raises this requirement. When
the effects of many stages are combined according to (4.5), a weak dependence
on stage resolution is introduced, requiring, for instance, an opamp DC gain in
a gain-of-2 stage 3.6 dB larger than in a gain-of-4 stage.
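
The single-stage gain requirement is straightforward to evaluate. The sketch
below assumes that the input referred error has the form
$(1/A)(1 + C_{in}/C_{tot}) V_{ref}$ and must stay below 0.5 LSB of a
$2 V_{ref}$ full scale; the function name and the capacitance ratio argument
are hypothetical.

```python
import math


def min_dc_gain_db(n_bits, c_in_over_c_tot=0.0):
    """Opamp DC gain (in dB) at which the gain error equals 0.5 LSB.

    With the input capacitance neglected this is 20*log10(2**N), i.e.
    about 6.02 dB per bit.
    """
    min_gain = (2 ** n_bits) * (1 + c_in_over_c_tot)
    return 20 * math.log10(min_gain)


for n in (8, 10, 12, 14):
    print(n, round(min_dc_gain_db(n), 1), round(min_dc_gain_db(n, 0.5), 1))
```

An input capacitance equal to half of the total sampling capacitance raises
the requirement by about 3.5 dB, which illustrates why the parasitic at the
opamp input is worth keeping small.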



5.6.2 Capacitor Matching 

Capacitor matching accuracy typically improves in proportion to the square
root of the capacitor size. Since the stage gain is inversely proportional to the
unit capacitor size, increasing the stage resolution, which makes the unit
capacitor smaller, reduces the input-referred error. Thus, a large resolution
stage is less sensitive to capacitor mismatch [73].

Figure 4.16. Required opamp GBW as a function of stage resolution.

5.6.3 Opamp Bandwidth 

The opamp bandwidth (together with the slew rate) determines the settling
time. In Appendix A the GBW requirement for an OTA in an SC amplifier
has been derived. According to this, settling to N-bit accuracy within time T
requires

$$GBW > \frac{(N - k) \cdot \ln 2 \cdot \left( 2^k + \frac{C_{tot}}{C_L} - \frac{C_{tot}}{2^k \cdot C_L} + \frac{C_{in} (2^k \cdot C_L + C_{tot})}{C_{tot} C_L} \right)}{2 \pi T}, \tag{4.10}$$

where $C_L$ is the opamp load capacitance without the MDAC capacitors. The
GBW is defined with the OTA transconductance and the load capacitance,
resulting in $GBW = g_m / (2 \pi C_L)$. Assuming that the load is primarily caused by
the sampling capacitance of a similar MDAC and $C_{in}$ is equal to
$C_{tot} (2^k - 1) / 2^k$ (see Appendix B), the term in parentheses becomes
$2^{k+1} + 1 - 2^{-k+1}$. Thus, the required opamp bandwidth depends almost
exponentially on the stage resolution. This is illustrated in Figure 4.16, where
$(N - k)(2^{k+1} + 1 - 2^{-k+1})$ is plotted against k, with N values 8, 10, 12,
and 14.
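
The simplification of the term in parentheses can be verified numerically.
The sketch below evaluates the general expression and compares it with the
closed form for a load equal to the sampling capacitance of a similar stage;
the function names are hypothetical and the capacitances are normalized to
$C_{tot} = 1$.

```python
def gbw_bracket(k, c_tot, c_load, c_in):
    """Term in parentheses of the GBW requirement for a k-bit stage."""
    return (2 ** k
            + c_tot / c_load
            - c_tot / (2 ** k * c_load)
            + c_in * (2 ** k * c_load + c_tot) / (c_tot * c_load))


def gbw_bracket_unscaled(k):
    """Closed form for C_L = C_tot and C_in = C_tot * (2**k - 1) / 2**k."""
    return 2 ** (k + 1) + 1 - 2 ** (1 - k)


def relative_gbw(n_bits, k):
    """Quantity plotted against k: (N - k) times the closed-form term."""
    return (n_bits - k) * gbw_bracket_unscaled(k)


for k in range(1, 6):
    c_in = (2 ** k - 1) / 2 ** k
    print(k, gbw_bracket(k, 1.0, 1.0, c_in), gbw_bracket_unscaled(k),
          [relative_gbw(n, k) for n in (8, 10, 12, 14)])
```

The two expressions agree for every k, and the printed lists reproduce the
near-exponential growth of the requirement with stage resolution.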



When the ADC sampling rate has to be maximized without capacitor scaling, 
1.5 bits/stage architecture is clearly the best choice [74]. In contrast, when 
capacitor scaling is used and other specifications, such as power consumption 
and area, are also important, higher stage resolution is often preferable. 

Increasing the stage gain decreases the feedback factor of the circuit, which 
leads to a higher opamp bandwidth requirement. Since unity gain stability is 
not needed from the opamp, the reduced feedback factor somewhat helps in 
fulfilling the increased bandwidth requirement. Higher stage gain also makes 
possible more aggressive capacitor scaling in the latter stages, which reduces 
both the total area and power consumption as well as the capacitive load to the 
previous stage. 

Let us first consider the case where capacitor scaling is not performed. In a 
design space where the technology limits have not yet been reached (i.e. OTA
bandwidth can still be increased), OTA power consumption is approximately
proportional to the square of $g_m$, on which the GBW linearly depends. When
increasing stage resolution from one effective bit (k = 1) to two bits (k = 2), 
the GBW has to be increased by a factor of roughly 1.9, which corresponds to 
an increase in opamp power consumption by a factor of 3.6. Since the number 
of stages is simultaneously halved, the total power consumption increases by 
a factor of 1.8. Further increasing stage resolution makes the penalty bigger. 
Thus, the 1-bit architecture also seems to be optimal for low power when capacitor
scaling is not used. A higher stage resolution may be justified only if opamp
power consumption is dominated by the slew rate instead of the GBW requirement
[5].
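
The factors quoted in this paragraph follow from the closed form $(N-k)(2^{k+1}+1-2^{-k+1})$ for the unscaled case. As a worked check, assume a 10-bit converter ($N = 10$ is an assumption made here; the text does not state which resolution was used):

```latex
\begin{align*}
\frac{\mathrm{GBW}(k=2)}{\mathrm{GBW}(k=1)}
  &= \frac{(10-2)\,(2^{3}+1-2^{-1})}{(10-1)\,(2^{2}+1-2^{0})}
   = \frac{8 \cdot 8.5}{9 \cdot 4} \approx 1.9, \\
\frac{P_{opamp}(k=2)}{P_{opamp}(k=1)} &\approx 1.9^{2} \approx 3.6, \\
\frac{P_{total}(k=2)}{P_{total}(k=1)} &\approx \frac{3.6}{2} = 1.8 .
\end{align*}
```

Other resolutions between 8 and 14 bits give GBW ratios from about 1.8 to 2.0, so the conclusion does not hinge on this choice.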

Let us now scale down the capacitors by a factor of $2^k$ between the stages,
when going toward the end of the pipeline. As a result the following stage loads
the MDAC with a sampling capacitance equal to $C_{tot}/2^k$. Let us assume for a
moment that the opamp's load is dominated by this capacitor. The opamp GBW
will be proportional to $2^k \cdot g_m/C_{tot}$, while the GBW required by the settling is
proportional to $(N - k) \cdot (2^{k+2} - 3)$. Thus, the requirement for the opamp $g_m$ stays almost
constant. This suggests that maximizing stage resolution, which minimizes the
number of stages, leads to minimum total power consumption, and the speed
is rather independent of stage resolution. More aggressive scaling increases
the advantage of high stage resolution, as seen from Figure 4.17, where the
required $g_m$ as a function of stage resolution in a 12-bit ADC is shown.
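
To see how flat the $g_m$ requirement is, the two proportionalities can be combined: the required $g_m$ scales as the required GBW times the load, i.e. as $(N-k)(2^{k+2}-3)/2^k$. The Python sketch below tabulates this for a 12-bit ADC with a scaling factor of $2^k$ only. It is an illustration under the same load assumption as the text and does not reproduce the second scaling factor of Figure 4.17; the function name is an assumption.

```python
def relative_gm(n_bits, k):
    """Required g_m (arbitrary units) with capacitors scaled by 2^k per stage.

    Required GBW ~ (N - k) * (2^(k+2) - 3) and GBW ~ 2^k * g_m / C_tot,
    so g_m ~ (N - k) * (2^(k+2) - 3) / 2^k for a fixed C_tot.
    """
    return (n_bits - k) * (2**(k + 2) - 3) / 2**k


N_BITS = 12
gm = {k: relative_gm(N_BITS, k) for k in range(1, 5)}
gm_normalized = {k: value / gm[1] for k, value in gm.items()}

for k in gm:
    print(f"k = {k}: g_m = {gm[k]:.1f} (x{gm_normalized[k]:.2f} relative to k = 1)")
```

Under these assumptions the spread over k = 1 to 4 stays within about 20 %, whereas the unscaled GBW requirement grows several-fold over the same range.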

In practice, however, besides the next stage sampling capacitor the opamp 
load comprises its own output capacitance, the input capacitance of the next 
stage comparators, and the wiring capacitance as well. The part of the GBW 
determined by them does not improve without increasing the opamp current. 
Furthermore, increasing stage resolution increases the number of comparators, 
which partially cancels the effect of scaling. In [75] stage resolution 2 or 3 bits
was found optimal for low power. The analysis, however, was made without
going to the circuit level.

Figure 4.17. Relative opamp transconductance $g_m$ in a 12-bit pipelined ADC as a function of stage
resolution using two different capacitor scaling factors.

To summarize, it is advantageous to use high stage resolution when the 
sampling capacitor is large, clearly dominating the opamp load [72]. This is 
the situation in the front stages of a high resolution ADC [76, 77, 73]. Since
1.5 bits/stage architecture does not permit extensive scaling, a higher speed can
be achieved with 2-bit or 3-bit stages. 

5.7 Calibration 

RSD logic takes care of comparator errors, but errors in the stage gain G 
and the reference levels produced by the sub-DACs remain in the output code. 
These errors are primarily dependent on capacitor matching, which is typically 
sufficient for 11 to 12-bit resolution. If, for some reason, the matching is worse
or a higher resolution is pursued, some form of self-calibration [78] or trimming 
is often used. 

Two approaches can be taken to calibrate out the errors: mixed signal or 
fully digital. In mixed signal calibration, the erroneous component values are 
measured from the digital output and adjusted closer to their nominal ones [79], 
which requires the capacitors in the MDACs to be adjustable. Alternatively, a
calibration DAC can be used to sum calibration coefficients to the MDAC input
[78]. Both methods apply the correction to the analog signal path, and thus 
require extra analog circuitry. 

In the fully digital approach the component values are not adjusted [80, 81, 
63]: they are just measured and used as they are. The idea behind this can 
be understood by looking again at the equation (4.3) for the conversion result. 
If the stage gain or the reference voltages deviate from their nominal values, 
it is not considered as an error: they are just unknown. Measuring them and 
using the measured values instead of the nominal ones in the formation of the 
ADC output code yields a perfectly correct result. The accuracy of this method 
depends on the accuracy of the measurement. 

Pipeline architecture has been found very suitable for calibration [79, 81, 
63, 82, 83, 77, 84]. The number of components to be calibrated is sufficiently 
small, since only the errors in the first few stages are significant as a result of the 
fact that, when referred to the input, the errors in the latter stages are attenuated 
by the preceding gain. Furthermore, no extra ADC is necessarily required for 
measuring the calibration coefficients, since the back-end stages can be used 
for measuring the stages in front of them. 

In the next equation the residue after the last pipeline stage is written on the 
left and the corresponding output of the final flash ADC on the right. 

$$V_{in}\cdot\prod_{i=1}^{m} G_i - \sum_{i=1}^{m}\left(Q_i\cdot V_{R,i}\cdot\prod_{j=i+1}^{m} G_j\right) = D_{m+1} \qquad (4.11)$$

The first term on the left is the input voltage, amplified by the total gain of 
the pipeline. The next term is the sum of the subtracted reference voltages, 
amplified by the gains of the subsequent stages. It can be seen that an error 
in the gain terms alone affects only the magnitude of the input voltage, which 
is usually not harmful. Thus, the gain terms can be left uncalibrated, which 
makes possible the realizing of the calibration algorithm without multipliers. 

As stated earlier, only the errors in the foremost stages are significant in 
practice. Let us assume that only the two first stages have errors and all the 
following stages are error-free. The residue after the second stage equals the 
digital output produced by the back-end stages: 

$$V_{in}\cdot G_1\cdot G_2 - Q_1\cdot V_{R,1}\cdot G_2 - Q_2\cdot V_{R,2} = D_{LSB} \qquad (4.12)$$

Isolating stage 1 from the rest of the pipeline, setting the stage 2 input voltage
to zero, and forcing $Q_2$ to go through all the possible bit codes, the unknown
reference voltages $V_{R,2}$ can be measured with the back-end. After that the
stage 1 reference voltages $V_{R,1} \cdot G_2$ can be measured in the same manner, the
calibrated stage 2 now being a part of the back-end. It should be noted that
the calibration automatically takes into account the nonideal gain ($G_2$) of stage
2 as well as the gain of the back-end. If more stages need to be calibrated, the
procedure can be started at any point in the pipeline.
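
The measurement procedure can be illustrated with a small behavioral model in Python. This is a sketch under several assumptions that are not from the text: both stages are 1.5-bit stages with codes -1, 0, and +1, the back-end is an ideal 10-bit quantizer that never clips (a real back-end has a limited input range), and the gain and reference errors are invented numbers. The names `Stage`, `backend`, and `convert` are hypothetical.

```python
VREF = 1.0                 # nominal reference voltage (assumed)
LSB = 2 * VREF / 2**10     # step of an assumed ideal 10-bit back-end
CODES = (-1, 0, 1)         # sub-ADC codes Q of an assumed 1.5-bit stage


def backend(v):
    """Ideal back-end ADC: rounds to the nearest LSB, no clipping (simplification)."""
    return round(v / LSB) * LSB


class Stage:
    """Pipeline stage with residue G*v - (reference for code Q), as in (4.12)."""

    def __init__(self, gain, ref):
        self.gain = gain   # actual (unknown) stage gain
        self.ref = ref     # actual (unknown) subtracted voltage for each code Q

    def residue(self, v, forced_q=None):
        if forced_q is None:
            q = 1 if v > VREF / 4 else -1 if v < -VREF / 4 else 0
        else:
            q = forced_q
        return q, self.gain * v - self.ref[q]


# Invented component errors, for illustration only.
stage1 = Stage(1.98, {-1: -1.010, 0: 0.002, 1: 0.992})
stage2 = Stage(2.01, {-1: -0.995, 0: 0.000, 1: 1.006})

# Stage 2: zero input, force every code, read the result with the back-end.
w2 = {q: -backend(stage2.residue(0.0, q)[1]) for q in CODES}

# Stage 1: same procedure; the calibrated stage 2 is now part of the back-end.
w1 = {}
for q in CODES:
    _, r1 = stage1.residue(0.0, q)
    q2, r2 = stage2.residue(r1)
    w1[q] = -(backend(r2) + w2[q2])


def convert(v, weights1, weights2):
    """Output code: back-end result plus the reference terms of both stages."""
    q1, r1 = stage1.residue(v)
    q2, r2 = stage2.residue(r1)
    return backend(r2) + weights1[q1] + weights2[q2]


nominal1 = {q: 2 * q * VREF for q in CODES}   # nominal reference of stage 1 times gain 2
nominal2 = {q: q * VREF for q in CODES}       # nominal reference of stage 2
total_gain = stage1.gain * stage2.gain        # gain error is left uncalibrated

inputs = [i / 1000 for i in range(-900, 901)]
err_cal = max(abs(convert(v, w1, w2) - total_gain * v) for v in inputs)
err_nom = max(abs(convert(v, nominal1, nominal2) - total_gain * v) for v in inputs)
print(f"calibrated: {err_cal / LSB:.2f} LSB, nominal weights: {err_nom / LSB:.2f} LSB")
```

The result is compared against the input scaled by the actual total gain, reflecting the observation that a pure gain error only changes the magnitude of the input. In this model the calibrated error cannot exceed two back-end LSBs, because up to four rounding errors of half an LSB each are summed.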

The calibration has limited accuracy, even with an error-free back-end, since 
the measured calibration coefficients cannot be more accurate than the resolu- 
tion of the back-end ADC permits. When, during the normal operation, a term 
corresponding to a measured reference voltage is added to the back-end bits, 
the truncation error in the conversion result and in the measured value add up. 
In practice the number of cumulated errors is much larger, since the number of
summed measurements needed to determine a single reference voltage
in a switched capacitor MDAC is, in the worst case, proportional to $2^k$, where $k$ is
the resolution of the stage [77].

To improve accuracy, the resolution of the back-end is typically enhanced by 
two or three bits. The calculations are performed with the enhanced resolution, 
after which the output is truncated to its final accuracy. 
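
The benefit of the extra bits can be estimated with a simple worst-case bound. Suppose the output code is formed by summing $n$ quantities, each carrying a truncation error of at most one back-end step $\Delta_{BE}$, and the back-end is enhanced by $b$ bits beyond the final resolution, whose step is $\Delta$. The symbols $n$, $b$, $\Delta$, and $\Delta_{BE}$ are notation introduced here, and the bound ignores all other error sources:

```latex
\begin{align*}
\Delta_{BE} &= \frac{\Delta}{2^{b}}, &
|e_{total}| &\le n \cdot \Delta_{BE} = \frac{n}{2^{b}}\,\Delta .
\end{align*}
```

With $b = 2$, for example, up to four such errors can accumulate before the total can reach one final LSB.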

The calibration cycle to determine the coefficients is typically performed at 
startup. This is not always sufficient, since the component values may drift over 
time and change with temperature and supply voltage. Thus, the calibration 
has to be repeated from time to time, which requires suspending the normal 
operation of the converter. In many systems the input signal contains idle periods,
which can be used for calibration. When this is not possible, the
calibration has to be performed in the background. It can be done by having
redundant circuit blocks (e.g. an extra pipeline stage [85]), so that when one 
element is being calibrated it is taken offline and replaced by another. Alterna- 
tively, some input signal samples can be substituted with digitally interpolated 
values and the freed clock cycle used for calibration [86, 77]. Running the ADC 
at a slightly higher clock rate than the front-end S/H circuit and queuing the 
sampled voltages is another way to free clock cycles for calibration [87]. 

If the reference levels in the MDAC are realized with a resistor string instead of
the more common capacitor array, the reference voltages can be measured with 
a separate calibration ADC in the background without disturbing the operation 
of the ADC [88]. Then, however, the gain error of the back-end stages is not 
automatically taken into account. 

In [89], a dynamic element matching (DEM) DAC is used in the pipeline 
stage and the DAC elements are measured directly from the output bitstream with
a correlator during the normal operation. 

6. Time-Interleaved ADC 

Figure 4.18 shows the block diagram of an architecture in which four ADCs 
are used in parallel to achieve four times the sampling rate of a single converter. 
This is often known as time-interleaved architecture [90], since the operation 
of the ADC channels is interleaved in such a way that one channel processes
every fourth sample. The digital outputs of the channels are combined with a
multiplexer to a single full-speed bitstream.

Figure 4.18. Four-channel time-interleaved ADC and its clock signals.
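
The sample routing of the four-channel architecture can be sketched in a few lines of Python. Each channel is modeled as an ideal 8-bit quantizer, which is an assumption made only so that the interleaved result can be compared with a single full-speed converter; the function and variable names are illustrative.

```python
import math

M = 4            # number of parallel channels, as in Figure 4.18
N_SAMPLES = 16   # length of the example record (assumed)


def channel_adc(v, bits=8):
    """Ideal uniform quantizer for inputs in [-1, 1], standing in for one channel."""
    levels = 2**bits
    code = int((v + 1.0) / 2.0 * levels)
    return max(0, min(levels - 1, code))


# Full-rate input samples of a test sine wave (frequency chosen arbitrarily).
samples = [math.sin(2 * math.pi * 0.0371 * n) for n in range(N_SAMPLES)]

# Channel c converts samples c, c + M, c + 2M, ..., i.e. every fourth sample.
channel_out = [[channel_adc(v) for v in samples[c::M]] for c in range(M)]

# The multiplexer picks the channels in round-robin order.
muxed = [channel_out[n % M][n // M] for n in range(N_SAMPLES)]

# With identical ideal channels the result equals a single full-speed ADC.
reference = [channel_adc(v) for v in samples]
print(muxed == reference)
```

Each channel only has to complete one conversion per four input samples, which is where the fourfold increase in sampling rate comes from.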

Time interleaving was used for the first time with pipelined ADCs in 1993 
in a four-channel [91] and in a two-channel ADC [92]. Since then several two- 
channel parallel pipeline ADCs have been published, e.g. [93, 94, 95, 96, 97,
98, 99, 100].

6.1 Problems and Solutions 

The problems of time-interleaved architecture arise from channel mismatch. 
The channels may have different offset voltages, their absolute gains can be 
different, or there can be a constant skew in the clock signals [90]. How these 
errors are seen in the spectrum of the sampled signal will be discussed in con- 
junction with the double-sampling technique in Chapter 9. 

Up to a certain resolution, component matching is good enough and the 
errors can be kept to a tolerable level with careful design. High-resolution 
time-interleaved ADCs, however, without exception use different techniques to 
suppress errors. 

The offset can be rather easily calibrated using mixed-signal [99] or all-digital
circuitry [91, 5]. Calibrating the gain mismatch is also possible, but
requires more complex circuitry than offset calibration [97, 98]. The timing 
skew may originate from the circuit generating the clock signals for differ- 
ent channels or it may be due to different propagation delays to the sampling 
circuits. Skew can be most easily avoided by using a full-speed front-end 
sample-and-hold circuit [92, 96, 5], as illustrated in Figure 4.19. The ADC 
channels resample the output of the S/H when it is in a steady state, and so the 
timing of the channels is not critical. The only problem is that the S/H circuit 
has to be very fast, since it operates at full speed. 
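
As an illustration of why offset mismatch is the easiest error to remove digitally, the Python sketch below estimates each channel's offset as the mean of that channel's own output samples and subtracts it. This is one simple possibility and is not claimed to be the method used in [91, 5]. It assumes a zero-mean input and therefore also removes any true DC component of the signal; the offset values and record parameters are invented.

```python
import math

M = 4                                     # number of interleaved channels
N = 4096                                  # record length (assumed)
BIN = 479                                 # input cycles per record; each channel's
                                          # 1024 samples also span whole cycles
OFFSETS = [0.010, -0.004, 0.000, 0.007]   # hypothetical channel offsets (full scale = 1)

# Interleaved output: sample n is converted by channel n % M and picks up its offset.
raw = [0.5 * math.sin(2 * math.pi * BIN * n / N) + OFFSETS[n % M] for n in range(N)]

# Estimate each channel's offset as the mean of its own samples ...
estimated = [sum(raw[c::M]) / len(raw[c::M]) for c in range(M)]

# ... and subtract it from every sample of that channel.
corrected = [x - estimated[n % M] for n, x in enumerate(raw)]

for c in range(M):
    print(f"channel {c}: true offset {OFFSETS[c]:+.4f}, estimate {estimated[c]:+.4f}")
```

Gain and skew errors cannot be found by simple averaging, since their effect depends on the input signal, which is consistent with the more complex circuitry mentioned for gain calibration.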

The errors caused by timing skew can be corrected with digital post-processing 
if the signal bandwidth is limited to be somewhat smaller than the Nyquist band 
[101]. The technique, however, requires measuring the actual skew, even with
sub-picosecond accuracy, which is not trivial. As an alternative to the digital
method, the skew can be compensated with adjustable delay elements, which
also requires measuring the skew.

Figure 4.19. Time-interleaved ADC with front-end S/H circuit.

7. A/D Converters: Summary 

For very low resolution (5 bits or below), flash architecture is typically the 
best choice. It is clearly the fastest architecture and can be scaled up to a 
6 to 7-bit resolution range. Resolutions from 5 to 10 bits can be covered with
subranging or folding and interpolating ADCs or with an architecture which is 
a combination of these. The speed achieved is not as high as with flash ADC, but 
typically somewhat higher than with pipelined ADC. Subranging and folding- 
and-interpolating ADCs can be realized without linear precision capacitors and 
the circuit area is typically quite small. 

Pipelined architecture can be used in a very wide resolution range, typically 
from 8 to 12 bits without calibration and up to 15 bits with calibration. The 
interstage gain makes it possible to scale the components along the pipeline, 
which leads to low power consumption. In addition, the switched capacitor 
technique — with some modifications — has shown itself to be capable of very 
low-voltage operation [102, 3]. This will be demonstrated with two prototypes
[8, 3] in Chapter 12. 



Chapter 5 

S/H CIRCUIT ARCHITECTURES 



This chapter gives a brief description of the S/H architectures found in recent 
publications. Although the focus is on CMOS implementations, the most im- 
portant bipolar architectures are also presented. The examples will show that 
the differences between the bipolar and the MOS device are highlighted in the 
design of S/H circuits. This leads to very different architectural solutions in 
high-performance designs. 




Figure 5.1. Simplified schematic of an S/H based on diode bridge.
1. Bipolar Architectures

1.1 Diode Bridge 

Traditionally, the high-speed S/H circuits implemented in GaAs technology 
employ a diode bridge as the switching element. Since a good quality diode is 
often also available in silicon bipolar technology, the same type of architecture 
can be used. Figure 5.1 shows a simplified schematic of an S/H circuit based 
on the diode bridge switch. In tracking mode clk is high and the current $I_1$
flows through the diode bridge. Since the impedance of a forward-biased diode
is very low, the output voltage across the capacitor C is almost the same as the
input voltage. The circuit is turned into hold mode when clk goes low and its
complement $\overline{clk}$ goes high, steering the bias current past the bridge. The high impedance of the
reverse-biased diodes virtually isolates the output from the input. In a practical 
circuit, at least the output has to be buffered and, usually, some additional diodes 
are needed to establish the reverse bias conditions. 

The major disadvantage of the diode bridge S/H is the limited signal swing 
when operating with low supply voltages (3.3 V or less). Also, high quality 
diodes are not always available in bipolar technology. 

A low voltage S/H circuit employing a single diode as a switch is reported 
in [103]. The circuit provides a 1.5 Vpp signal swing with a 3 V supply. 

1.2 Switched Emitter Follower 

Most bipolar S/H circuits in recent publications rely on a switched emitter 
follower as a switch and are usually implemented using npn transistors only. 
The architecture was first introduced in [104] and its schematic is redrawn 
in Figure 5.2. Both halves of the differential circuit consist of a switch and 
an output buffer. The input is brought to the switches through a differential 
input buffer, whose linearity is one of the major concerns in this architecture. 
The linearity is improved by adding emitter degeneration resistors in the input 
differential pair (Q1 and Q2) and diodes (Q3 and Q4) in series with the load 
resistors. 

In track mode the buffered input is sampled in the capacitor $C_H$ through the
emitter follower Q5. In transition to hold mode the bias current of the emitter
follower is turned off and its base is pulled down. The minimum size of the
sampling capacitor $C_H$ is limited by the droop rate of the held output. To make
the droop smaller, the bias current of the first emitter follower (Q8) in the output
buffer is turned off in hold mode. The droop in the differential output signal is
considerably smaller than in the case of single output, since it is mostly common
mode.
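
The link between capacitor size and droop follows from the charge balance on the hold node. Write $C$ for the sampling capacitor, $I_{leak}$ for the net current drawn from the hold node in hold mode (for instance the base current of the emitter follower that buffers it), and $T_{hold}$ for the hold time. All three symbols are notation introduced here, and the numbers are purely illustrative:

```latex
\begin{align*}
\frac{dV_{hold}}{dt} &= -\frac{I_{leak}}{C}, &
|\Delta V_{hold}| &\approx \frac{I_{leak}\, T_{hold}}{C}
  = \frac{1\,\mu\mathrm{A} \cdot 5\,\mathrm{ns}}{1\,\mathrm{pF}} = 5\,\mathrm{mV}.
\end{align*}
```

For a given leakage current and hold time, halving the capacitor doubles the droop, which is why the droop rate sets a lower limit on the capacitor size.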

In hold mode the signal at the output of the input buffer couples to the hold
capacitor through the parasitic base-emitter capacitor of Q5. The feedthrough is
minimized by connecting a feed-forward capacitor C1 from the other output of
the input buffer to the hold node. This capacitor is implemented by employing
the base-emitter junction capacitance of a BJT.

Figure 5.2. S/H architecture using emitter follower switch [104].

While a 120-MS/s sampling rate was achieved in [104], it was increased to
1.2 GS/s in [105], mainly through the use of a more advanced technology. 
However, the increased speed was paid for by decreased linearity (from 10 
bits to 8 bits) and increased power consumption (from 40 mW to 460 mW). 
In [106] some of the same authors managed to restore the resolution to 10 
bits while maintaining the previous power consumption. This was achieved 
by modifying the input buffer and adding a droop compensation circuit into 
the output buffer. A 10-bit 250-MS/s S/H circuit with different buffers and a 
different hold-feedthrough cancellation technique is presented in [107]. While 
all the previous designs require a 5-V supply, in [108] both the input and the 
output buffers are redesigned so as to allow operation with a 2.7-V supply 
voltage. 

Another low voltage (3.3 V) architecture is presented in [109]. There, the 
need for an input buffer is eliminated by using series type sampling, as in many 
CMOS architectures. The THD, measured with a 10-MHz input signal, was 
60 dBc at a 100-MHz sampling rate. The quasidifferential (two single-ended 
circuits in parallel) circuit provides a differential signal swing as high as 3 V 
from a 3.3 V supply. 

2. CMOS Architectures 

One of the main challenges in bipolar S/H design is the lack of a good simple 
switch. The MOSFET is almost an ideal switch. When operating in the triode
region it can be considered as a voltage-controlled resistor. The capacitive
coupling over an open MOS switch is typically small and thus the hold mode
feedthrough is less of a problem. In addition, the purely capacitive impedance
seen from the gate of a MOSFET allows the storing and buffering of sampled
charges for a long time period without a significant droop.

Figure 5.3. A simple S/H circuit employing a source follower buffer.

In bipolar S/H circuits the problems are mainly connected with the linearity of
the buffer circuits and hold mode feedthrough via the junction capacitances of
the switch device. However, by paying great attention to buffer design and using 
different linearization and feedthrough cancellation techniques, it is possible to 
make high-performance S/H circuits with open loop buffers. In CMOS the lower 
transconductance of the MOS device and the body effect make the simplest open 
loop buffer, the source follower, much less linear than its bipolar counterpart, 
the emitter follower. 

A well-known technique used to linearize circuits is to utilize negative feed- 
back. For example, an opamp connected to unity gain feedback makes a very 
linear buffer. The use of feedback does not need to be restricted to the buffer 
only. Enclosing the sampling capacitor in the feedback loop reduces the ef- 
fects of nonlinear parasitic capacitances and signal-dependent charge injection 
from the MOS switches. Unfortunately, an inevitable consequence of the use 
of feedback is reduced speed. 

The tradeoff between speed and linearity has caused researchers to take two 
different approaches to the design of high-speed high-resolution CMOS S/H 
circuits. One is to use an open loop architecture and make an effort to maximize 
the linearity and the other is to employ a closed loop architecture and maximize 
the speed. 

2.1 S/H Circuit with Source Follower Buffer 

Figure 5.3 shows a simple S/H circuit using the source follower buffer. Ide- 
ally, the channel current of a MOS transistor depends only on the gate-source 
voltage of the device. Consequently, a MOS transistor biased with a constant 
current provides a constant voltage shift from the gate to the source. The circuit
has purely capacitive input impedance and low output impedance, and thus
it seems as if it is an ideal solution for buffering a charge stored in a capacitor.
However, there are two different nonidealities that introduce input voltage
dependency into the gate-source voltage. These are the bulk effect, which is
the channel current dependency on the source-bulk voltage, and the finite output
resistance seen when looking at the transistor from the drain. The output
impedance of the current source employed in biasing the source follower also
has an effect. The output impedance is proportional to the channel
length of the transistor, and thus its finite value has an increased importance in
short channel MOSFETs.

Figure 5.4. Simplified schematic of the S/H circuit presented in [110].
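
The influence of the bulk effect on the source follower can be made concrete with a short numerical sketch in Python. It uses the standard long-channel threshold voltage expression, which is textbook material rather than something given in this chapter, and all parameter values are assumptions. The bulk is taken to be grounded, the bias current is represented by a constant overdrive voltage, and the finite output resistance is not modeled.

```python
import math

VTH0 = 0.5     # zero-bias threshold voltage in volts (assumed)
GAMMA = 0.5    # body-effect coefficient in sqrt(V) (assumed)
PHI = 0.7      # surface potential 2*phi_F in volts (assumed)
VOV = 0.2      # overdrive voltage set by the constant bias current (assumed)


def threshold(v_sb):
    """Threshold voltage as a function of the source-bulk voltage."""
    return VTH0 + GAMMA * (math.sqrt(PHI + v_sb) - math.sqrt(PHI))


def follower_output(v_in):
    """Solve v_out + VOV + Vth(v_out) = v_in by bisection (bulk at ground)."""
    lo, hi = 0.0, v_in
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mid + VOV + threshold(mid) > v_in:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def small_signal_gain(v_in, dv=1e-4):
    """Incremental gain dVout/dVin from a central difference."""
    return (follower_output(v_in + dv) - follower_output(v_in - dv)) / (2 * dv)


gain_low = small_signal_gain(1.2)
gain_high = small_signal_gain(2.2)
print(f"gain at 1.2 V input: {gain_low:.3f}, at 2.2 V input: {gain_high:.3f}")
```

The gain is below unity and changes with the input level, so the gate-source voltage is no longer a constant shift. This input dependence is what appears as distortion.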

Probably the only way to get rid of the bulk effect is to connect the source and 
the bulk of the transistor together. This requires that the transistor can be put 
in a well of its own, which is possible only with pMOS transistors in a typical 
CMOS process, which uses a p-type substrate. The penalty incurred by using 
a pMOS transistor is the slower speed in comparison to an nMOS solution. 
An S/H circuit employing an nMOS source follower buffer implemented in a
non-typical CMOS process is presented in [110]. An implementation using a 
standard CMOS process and pMOS source follower is reported in [111]. The 
S/H circuits from [110] and [111] are shown in Figures 5.4 and 5.5 respectively. 

In Figure 5.4 the effect of the finite output resistance of the source follower
transistor M2 is reduced by making its drain-source voltage almost constant
by cascoding it with the transistor M1. To keep M2 in saturation its effective
threshold voltage is made larger than the threshold voltage of M1 by biasing
its bulk with the diode-connected transistor M3. The output impedance of the
bias current source is enlarged by cascoding the current source transistor M5
with the transistor M4. As a result of the five stacked transistors the circuit
requires a 6-V supply voltage and the signal swing is still limited to 800 mV.
The circuit achieves roughly 60-dBc linearity at a 100-MHz clock rate with a
10-MHz signal frequency.

Figure 5.5. S/H utilizing a linearized source follower buffer [111].

A lower supply voltage (~3 V) can be used with the circuit presented in
Figure 5.5. The drain-source voltage of the source follower M1 is bootstrapped
with a circuit consisting of M2 and R. Although this is not explained in the
reference, the resistor R is used instead of a current source, probably in order
to minimize the supply voltage. If a current source were used, connecting the
gate of M1 through S1 to the ground would cause the current source to drop
out of saturation, which would slow down the transition to hold mode. To keep
the current source in saturation, S1 should be connected to a higher potential,
increasing the required supply voltage and circuit complexity. The drawback
of using the resistor is the relatively high current flowing through M2. This
is due to the fact that the bias current I determines the minimum voltage on
the resistor R, which cannot be very high, in order to keep M1 in saturation
when its gate is connected to the ground. The current provided by M2 should
generate a voltage variation on the resistor R which is equal to the signal swing.
In practice this means that the average current through M2 during hold mode
can be larger than the current I. On the other hand, R must be much larger than
$1/g_{m2}$ to make the drain-source voltage of M1 constant enough. It might be
impossible to set the value of R so as to satisfy both these and the requirements
set by the desired operating point.

In both circuits voltage-dependent charge injection from the switch transistors
is avoided by taking the sample by opening the switch S1 slightly before the
input switch S2. Since S1 is connected to a constant potential in both circuits,
the charge it injects to the sampling capacitor is constant. Because the capacitor
is floating when S2 is opened, its charge injection cannot distort the sampled
voltage. This technique is called bottom plate sampling and it is discussed in
more detail in Chapter 6. Since the left terminal of the capacitor is connected
to a low impedance in hold mode, the attenuation to input signal feedthrough
is very high.

Although distortion originating from the charge injection is prevented, a new 
source of distortion is introduced. Let us consider the basic S/H circuit with 
a source follower buffer in Figure 5.3 and assume that there is a nonlinear 
parasitic capacitance from the input of the buffer to the ground. It is in parallel 
with the sampling capacitor and thus the same voltage is sampled into both the 
capacitors. There is no charge redistribution in transition to hold mode and thus 
the nonlinear capacitance has no effect on the sampled voltage. 

Let us now consider either of the circuits in Figure 5.4 or Figure 5.5. In 
sampling mode the input of the buffer is connected to a constant potential, 
while the input voltage is applied to the top plate of the sampling capacitor. In 
transition to hold mode the sampling capacitor is flipped over by connecting 
its top plate to the signal ground. The signal charge is redistributed between 
the sampling capacitor and the parasitic capacitor at the buffer input. Any 
nonlinearity in the parasitic capacitance produces harmonic distortion. The 
main source of nonlinear capacitance is the junction capacitance of the drain- 
bulk diode of the switch transistor. The capacitive loading effect of the source 
follower transistor is small, since both the gate-source and gate-drain voltages 
are almost constant. To minimize distortion the sampling switch should be 
small and the sampling capacitor large; thus there is a tradeoff between speed 
and linearity. 
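
The charge redistribution argument can be illustrated numerically. In the Python sketch below the nonlinear parasitic capacitance is represented by a quadratic charge-voltage law, which is an assumption standing in for a real junction capacitance. The constant potential at the buffer input is taken as zero, and all values (capacitances in pF, voltages in V) are invented for illustration.

```python
import math


def held_voltage(v_in, c_s, c_p0, a):
    """Voltage at the buffer input after the sampling capacitor is flipped over.

    Charge conservation at the floating buffer input node:
        c_s * v_x + q_p(v_x) = -c_s * v_in,
    with an assumed nonlinear parasitic charge q_p(v) = c_p0 * (v + a * v**2).
    """
    qa = a * c_p0
    qb = c_s + c_p0
    qc = c_s * v_in
    # Root of qa*v^2 + qb*v + qc = 0 that tends to the linear solution as a -> 0.
    return (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)


def even_order_error(c_s, c_p0=0.1, a=0.2, amplitude=0.5):
    """Asymmetry between the responses to +amplitude and -amplitude, relative to amplitude."""
    v_pos = held_voltage(+amplitude, c_s, c_p0, a)
    v_neg = held_voltage(-amplitude, c_s, c_p0, a)
    return abs(v_pos + v_neg) / (2 * amplitude)


error_small_cs = even_order_error(c_s=1.0)
error_large_cs = even_order_error(c_s=4.0)
print(f"C_s = 1 pF: {error_small_cs:.5f}, C_s = 4 pF: {error_large_cs:.5f}")
```

With a perfectly linear parasitic the two responses would cancel exactly. The residual asymmetry shrinks as the sampling capacitor grows relative to the parasitic, in line with the tradeoff stated above.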

2.2 S/H Circuit Using Miller Capacitance 

In [112] an interesting approach is used to reduce the signal dependent charge
injection. The idea is to use the Miller effect to increase the effective capacitance 
in hold mode in order to render negligible the voltage step resulting from the 
charge injection. The sampling is fast and the switch sizes can be kept small 
thanks to the small physical sampling capacitor value, which is not multiplied by 
the Miller effect in sampling mode. The proposed circuit is shown in Figure 5.6. 

The operation of the circuit is as follows. In sampling mode both the switch
transistors M1 and M2 are conducting and thus the opamp is connected to unity
gain feedback. The sampling capacitance is formed of the parallel combination
of the capacitors C1 and C2, both connected to the low output impedance of the
opamp. At the sampling instant the switch transistors M1 and M2 are turned
off. The transistor M2 operates at a constant potential and thus the charge it
injects into C1 does not produce distortion. The transistor M1, however, injects
an input-dependent charge into node x. Now, since the feedback path around
the opamp is broken, the effective value of C2 is multiplied by (A+1), where A
is the open loop gain of the opamp. As a result of the increased capacitance the
injected charge produces only a negligible voltage change in node x.
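
As a rough, first-order illustration of the benefit, let $Q_{inj}$ denote the charge injected into node x by M1 and neglect every capacitance at that node other than the Miller-multiplied C2. Both the symbol $Q_{inj}$ and the numbers below are assumptions chosen for illustration, not values from [112]:

```latex
\begin{align*}
\Delta V_x &\approx \frac{Q_{inj}}{(A+1)\,C_2}
  = \frac{10\,\mathrm{fC}}{(100+1)\cdot 0.5\,\mathrm{pF}} \approx 0.2\,\mathrm{mV},
&
\frac{Q_{inj}}{C_2} &= \frac{10\,\mathrm{fC}}{0.5\,\mathrm{pF}} = 20\,\mathrm{mV}.
\end{align*}
```

The second expression is the step that the same charge would cause on C2 without the Miller multiplication.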
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Figure 5.5. S/H utilizing a linearized source follower buffer [111]. 



The circuit achieves roughly 60-dBc linearity at a 100-MHz clock rate with a 
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with a circuit consisting of M2 and R. Although this is not explained in the 
reference, the resistor R is used instead of a current source, probably in order 
to minimize the supply voltage. If a current source was used, connecting the 
gate of Ml through SI to the ground would cause the current source to drop 
from saturation, which would slow down the transition to hold mode. To keep 
the current source in saturation, SI should be connected to a higher potential, 
increasing the required supply voltage and circuit complexity. The drawback 
of using the resistor is the relatively high current flowing through M2. This 
is due to the fact that the bias current I determines the minimum voltage on 
the resistor R, which cannot be very high, in order to keep Ml in saturation 
when its gate is connected to the ground. The current provided by M2 should 
generate a voltage variation on the resistor R which is equal to the signal swing. 
In practice this means that the average current through M2 during hold mode 
can be larger than the current /. On the other hand, R must be much larger than 
1 / Qm2 to make the drain-source voltage of Ml constant enough. It might be 
impossible to set the value of R so as to satisfy both these and the requirements 
set by the desired operating point. 

In both circuits voltage-dependent charge injection from the switch transis- 
tors is avoided by taking the sample by opening the switch S 1 slightly before the 
input switch S2. Since SI is connected to a constant potential in both circuits, 
the charge it injects to the sampling capacitor is constant. Because the capacitor 
is floating when S2 is opened, its charge injection cannot distort the sampled 
voltage. This technique is called bottom plate sampling and it is discussed in 
more detail in Chapter 6. Since the left terminal of the capacitor is connected 



to a low impedance in hold mode, the attenuation to input signal feedthrough 
is very high. 

Although distortion originating from the charge injection is prevented, a new 
source of distortion is introduced. Let us consider the basic S/H circuit with 
a source follower buffer in Figure 5.3 and assume that there is a nonlinear 
parasitic capacitance from the input of the buffer to the ground. It is in parallel 
with the sampling capacitor and thus the same voltage is sampled into both the 
capacitors. There is no charge redistribution in transition to hold mode and thus 
the nonlinear capacitance has no effect on the sampled voltage. 

Let us now consider either of the circuits in Figure 5.4 or Figure 5.5. In 
sampling mode the input of the buffer is connected to a constant potential, 
while the input voltage is applied to the top plate of the sampling capacitor. In 
transition to hold mode the sampling capacitor is flipped over by connecting 
its top plate to the signal ground. The signal charge is redistributed between 
the sampling capacitor and the parasitic capacitor at the buffer input. Any 
nonlinearity in the parasitic capacitance produces harmonic distortion. The 
main source of nonlinear capacitance is the junction capacitance of the drain- 
bulk diode of the switch transistor. The capacitive loading effect of the source 
follower transistor is small, since both the gate-source and gate-drain voltages 
are almost constant. To minimize distortion the sampling switch should be 
small and the sampling capacitor large; thus there is a tradeoff between speed 
and linearity. 

2.2 S/H Circuit Using Miller Capacitance 

In [ 1 1 2] an interesting approach is used to reduce the signal dependent charge 
injection. The idea is to use the Miller effect to increase the effective capacitance 
in hold mode in order to render negligible the voltage step resulting from the 
charge injection. The sampling is fast and the switch sizes can be kept small 
thanks to the small physical sampling capacitor value, which is not multiplied by 
the Miller effect in sampling mode. The proposed circuit is shown in Figure 5.6. 

The operation of the circuit is as follows. In sampling mode both the switch 
transistors M1 and M2 are conducting and thus the opamp is connected in unity 
gain feedback. The sampling capacitance is formed of the parallel combination 
of the capacitors C1 and C2, both connected to the low output impedance of the 
opamp. At the sampling instant the switch transistors M1 and M2 are turned 
off. The transistor M2 operates at a constant potential and thus the charge it 
injects into C1 does not produce distortion. The transistor M1, however, injects 
an input-dependent charge into node x. Now, since the feedback path around 
the opamp is broken, the effective value of C2 is multiplied by (A+1), where A 
is the open loop gain of the opamp. As a result of the increased capacitance the 
injected charge produces only a negligible voltage change in node x. 
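
The benefit of the Miller multiplication can be illustrated with a rough numerical sketch. It assumes that the voltage step at node x is simply the injected charge divided by the effective hold capacitance and ignores every other capacitance at the node. The charge, the capacitor value, and the opamp gain are illustrative numbers, not values from [112].

```python
def pedestal_step(q_inj, c2, gain=0.0):
    """Voltage step at node x caused by an injected charge q_inj.

    Simplifying assumption: only C2 loads node x, and in hold mode its
    effective value is (gain + 1) * c2 because of the Miller effect.
    """
    return q_inj / ((gain + 1.0) * c2)


# Illustrative (hypothetical) values: 10 fC of injected charge, C2 = 0.5 pF.
q_inj = 10e-15
c2 = 0.5e-12

step_plain = pedestal_step(q_inj, c2)                # no Miller multiplication
step_miller = pedestal_step(q_inj, c2, gain=1000.0)  # open loop gain A = 1000

print(f"without Miller effect: {step_plain * 1e3:.3f} mV")
print(f"with Miller effect:    {step_miller * 1e6:.3f} uV")
```

With these assumptions the step shrinks by the factor (A+1), which is why a small physical capacitor is sufficient.
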

Figure 5.7. A simplified schematic of an S/H circuit using switched transconductance. 



Figure 5.6. Sample-and-hold circuit using Miller hold capacitance [112]. 



Since the voltage at the input of the output buffer is the same before sampling 
and in hold mode, the nonlinear parasitic capacitances do not distort the signal. 
The dominant distortion source in this circuit originates from the fact that the 
two switches operate simultaneously. The switches are coupled through the 
capacitor C1, and thus turning off M1 introduces some signal dependence to 
the charge injected by M2 into the capacitor C1. This phenomenon is analyzed 
more thoroughly in the reference. 

2.3 Switched Transconductance S/H Architecture 

A CMOS implementation of a known bipolar S/H architecture (e.g. [113]) is 
presented in [114]. The main idea in this architecture is to perform the sampling 
by turning off a MOSFET biased in the saturation region, as opposed to the more 
common practice of operating the transistor switch in the triode region. The 
advantage of biasing the switch in saturation is the fact that then the transistor 
channel is pinched off at the drain end. Consequently, the charge released when 
the transistor is turned off is injected to the source of the device, so it does not 
distort the sampled signal. However, since the voltage at the drain of a saturated 
MOSFET is not strongly defined, the switch must be enclosed in a feedback 
loop. 

A simplified schematic of the architecture is shown in Figure 5.7. The circuit 
consists of an opamp, a sampling capacitor, and a unity gain output buffer. 
The first stage of the opamp is represented with an amplifier symbol, while the 
push-pull-type output stage is drawn with two transistors and a constant voltage 
source. 

In sampling mode the feedback loop is closed and the output voltage, as 
well as the voltage on the sampling capacitor, follows the input voltage. The 




Figure 5.8. An S/H circuit whose gain is determined by resistor ratio [115]. 

sampling is carried out by turning off the output stage of the opamp by shorting 
the gates of the output transistors to their sources. 

In hold mode the feedback loop is broken and the voltage sampled in the 
capacitor is buffered by the unity gain buffer. Feedback in sampling mode 
makes the linearity of the buffer and the capacitor irrelevant also in hold mode. 
Since the output transistors operate in saturation, the distortion caused by charge 
injection is also minimized. 

In [114] the authors report 75-dBc SFDR for their pseudo-differential circuit. 
The closed-loop architecture limits the sampling rate to 10 kHz. 

2.4 Closed-Loop S/H Circuit with Resistor Ratio Defined 
Gain 

A closed-loop S/H circuit proposed in [115] is shown in Figure 5.8. In 
tracking mode (with the switch closed) the opamp is connected in an inverting 
feedback amplifier configuration and the output voltage of the circuit is 
-(R2/R1) · VIN. When the switch is opened this voltage is sampled in the hold 
capacitor CH. Since the switch is connected to virtual ground it does not 
introduce a signal-dependent charge error. The tracking mode bandwidth of the 
circuit is extended by adding a capacitor in parallel with the resistor R1 
to create a zero, which is used to cancel the pole due to the hold capacitor. 

The circuit needs only one clock signal, which makes implementation simple. 
Since the circuit can realize gains other than one, it can be used as an interstage 
S/H in pipelined ADCs. The circuit can achieve quite high sampling rates 
(50 MS/s in [115] and 150 MS/s with a BiCMOS technology in [116]), but 
its track-and-hold nature limits the usable input signal bandwidth to far below 
the Nyquist frequency. Hold mode feedthrough can also become a problem, 
since there is a coupling path from the input to the output through the resistors. 
(A technique to reduce the feedthrough is proposed in [116].) Furthermore, 
resistor matching is known to be worse than capacitor matching in most IC 
technologies. Thus, it is preferable to use a circuit whose gain is determined 
by a capacitor ratio in applications where an accurate gain is needed. 

2.5 S/H Circuit with Capacitor Ratio Defined Gain 

When an S/H circuit with a precise gain (which is generally different from 
one) is needed, a switched capacitor circuit with a capacitor ratio defined gain 
is the best solution. Figure 5.9 shows the S/H circuit used in the widely cited 
pipelined ADC design [57]. The input voltage is sampled passively 
in the capacitor(s) C1 and in hold mode the sampled charge is transferred to 
the capacitor(s) C2. The ratio of the held output voltage to the sampled input 
voltage is defined by C1/C2. In sample mode the capacitors C2, as well as the 
opamp's outputs, are reset. The feedback factor of the circuit depends on the 
capacitor ratio and thus on the gain. The larger the gain is set, the smaller the 
feedback factor becomes, which increases the settling time of the circuit. 




Figure 5.10. A half circuit of a fully differential S/H circuit without a reset phase [118]. 



A common modification to the circuit shown in Figure 5.9 achieves faster 
settling by means of a modified sampling configuration. Instead of resetting the 
capacitor C2 in sample mode, it is connected in parallel with C1. Consequently, 
the value of C1 has to be reduced by the value of C2 to obtain the same gain 
as with the original circuit. The reduction of the C1 value increases the feedback 
factor in hold mode, which in turn speeds up settling. The improvement is 
significant only with small (~2) gain values. 
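
The effect of the modification on the feedback factor can be checked with a short sketch. It assumes the usual first-order expression for the hold-mode feedback factor of a switched capacitor amplifier, C2 / (C1 + C2 + Cpar), where Cpar is a hypothetical parasitic capacitance at the opamp input (zero by default). The function name and the normalization C2 = 1 are choices made for this illustration.

```python
def feedback_factors(gain, c2=1.0, c_par=0.0):
    """Return (beta_reset, beta_parallel) for a given closed-loop gain.

    beta_reset:    C2 is reset during sampling, so C1 = gain * C2.
    beta_parallel: C2 also samples the input, so C1 = (gain - 1) * C2.
    c_par is a hypothetical parasitic capacitance at the opamp input.
    """
    c1_reset = gain * c2
    c1_parallel = (gain - 1.0) * c2
    beta_reset = c2 / (c1_reset + c2 + c_par)
    beta_parallel = c2 / (c1_parallel + c2 + c_par)
    return beta_reset, beta_parallel


for gain in (2, 4, 8):
    b_reset, b_par = feedback_factors(gain)
    print(f"gain {gain}: beta {b_reset:.3f} -> {b_par:.3f} "
          f"(improvement x{b_par / b_reset:.2f})")
```

Under these assumptions the improvement factor is (gain + 1) / gain, so it is largest at a gain of two and fades quickly as the gain grows.
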

A modification proposed in [79] adds a sampling switch between the opamp 
inputs. Now the original sampling switches, which are opened slightly before 
the new switch, need only to pass a common mode signal, which allows making 
them small. As a result unbalanced charge injection is reduced. Signal depen- 
dent charge injection reduces as well, since in the new configuration the voltage 
drop across the opamp inputs is only half of the original when the same switch 
size is used. 

2.6 S/H Circuit without a Reset Phase 

The slew rate requirement for an S/H circuit output is set by the difference 
between successive output levels. Resetting the output of an S/H circuit during 
sample mode often makes the requirement tighter than required by the signal. 
A single-ended S/H circuit in which the output in the sample mode stays in 
the vicinity of its last hold level is presented in [117]. In [118] the idea is 
developed further by extending the length of the hold phase to the whole 
clock period. The fully differential circuit was later employed in [96] (the 
design is also reported in [119]). 

The circuit from reference [118] is shown in Figure 5.10. For the sake of 
clarity the capacitors and switches of only one half circuit are drawn. The sampling 
is performed passively with the capacitor C1 in the same way as was done 
in Figure 5.9. During sampling, the capacitor C3 is connected between Vout- 
and the bias voltage VB2, which is also the opamp input common mode level. In 
transition to hold mode the sampling capacitor C1 is connected in parallel with 
C2 and the bottom terminal of C3 is switched to the opamp's negative input. 
Since the voltages on capacitors C2 and C3 are complementary, they cancel 
each other out when the capacitors are connected in series configuration at the 
beginning of hold mode. This naturally requires the capacitors to be equal in 
size. 

Cancellation performs the reset of the hold capacitor C2 and thus no separate 
reset phase is needed. At the end of hold mode the capacitor C1 is disconnected 
from the feedback loop, but the output voltage remains held by the capacitor C2. 
Thus the hold phase is effectively extended to overlap with the next sampling 
period. This, however, does not alleviate the settling requirement, since the 
output must be fully settled before C1 is disconnected. 

Due to the lack of a reset phase and the large feedback factor, the circuit 
achieves high sampling rates. A 100-MS/s sampling rate with approximately 
9-bit resolution is achieved in [96]. 



Chapter 6 



SAMPLING WITH A MOS TRANSISTOR SWITCH 



Usually, when used as a switch, a MOS transistor is operated in the triode 
region (or linear region). Then the equivalent circuit for the transistor is a 
resistor whose value is controlled by the transistor gate voltage. When the 
switch is closed the value of the on-resistance is in a range from a few ohms to 
a few kilo-ohms. In contrast, the resistance of an open switch is so high that in 
practice the switch is an open circuit. 

In addition to the finite on-resistance, there are also parasitic capacitances 
associated with the switch. This is illustrated in Figure 6.1, where a simple MOS 
sampling circuit is shown on the left and its equivalent RC circuit, including 
the parasitics, on the right. The capacitances Cp1 and Cp2 are due to drain and 
source junction capacitances and channel-to-bulk capacitance. The gate-to-source 
and gate-to-drain overlap capacitances and gate-to-channel capacitances 
are represented by the capacitors C1 and C2. The resistor RCLK models the 
output impedance of the clock driver. In the sampling circuit of Figure 6.1 
the value of RCLK plays an important role in hold mode feedthrough in high 




Figure 6.1. MOS sampling circuit and its RC equivalent. 

Figure 6.2. The finite slope of the sampling signal Vsmpl results in an input voltage-dependent 
sampling delay Δt. For the sake of simplicity the transistor threshold voltage is assumed to be 
zero. 



frequency applications, since together with C1 and C2, it forms a high pass 
coupling path past an open switch. 

1. Voltage-Dependent Turn-Off Moment 

A MOS switch like the one in Figure 6.1 turns off when its gate-source voltage 
becomes less than the transistor threshold voltage. (This is a simplification, 
accurate enough to understand the problem here. In reality the switch resistance 
is a continuous function of the gate-source voltage, which can be taken into 
account when the sampling operation is described with a sampling function, as 
will be discussed at the end of this chapter.) When the switch is on, the source 
voltage equals the input voltage. As a result of this and the finite turn-off slope 
a of the gate voltage, the delay Δt from the moment when the gate voltage 
starts to fall to the switch turn-off moment depends on the input voltage. This 
is illustrated in Figure 6.2. 

The following analysis shows how the voltage-dependent delay is reflected 
in the sampled voltage VOUT. Making the assumption that the input voltage 
change during Δt is small, an expression for Δt can be written as 

\[ \Delta t \approx \frac{V_{on} - V_{IN}(nT)}{a} = \frac{V_{on} - A\sin(\omega nT)}{a}, \tag{6.1} \]

where the last form is obtained assuming a sinusoidal input. The output waveform 
can be approximated by 

\[ V_{OUT}(nT) = V_{IN}(nT + \Delta t) \approx V_{IN}(nT) + \frac{dV_{IN}(nT)}{dt}\,\Delta t. \tag{6.2} \]

For a sinusoidal input this is 

\[ V_{OUT}(nT) \approx A\sin(\omega nT) + \frac{A\omega\cos(\omega nT)\,[V_{on} - A\sin(\omega nT)]}{a}. \tag{6.3} \]

Expanding the last term yields 

\[ V_{OUT}(nT) \approx A\sin(\omega nT) + \frac{A\omega V_{on}}{a}\cos(\omega nT) - \frac{A^2\omega\sin(2\omega nT)}{2a}. \tag{6.4} \]

From this it can be clearly seen that the input voltage-dependent turn-off moment 
results in second order harmonic distortion: 

\[ HD_2 = \frac{A\omega}{2a} = \frac{A\omega T_F}{2A_{CLK}}, \tag{6.5} \]

where ACLK is the clock amplitude (Von - Voff) and TF the clock fall time 
(ACLK/a). For example in a circuit where the clock amplitude is 3 V, signal 
amplitude 0.75 V, signal frequency 50 MHz, and clock fall time 0.1 ns, the level 
of the resulting second harmonic is as high as -48 dBc. 
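
The numerical example can be reproduced with a short script. The sketch below evaluates the closed-form estimate HD2 = A·ω·TF / (2·ACLK) and, as a cross-check, samples a sine wave with the signal-dependent delay Δt = (Von - VIN)/a and measures the second harmonic with a discrete Fourier transform. The assumptions that Voff = 0 and that the threshold voltage is zero, as well as the function names and the choice of 64 samples over 5 signal periods, belong to this illustration only.

```python
import cmath
import math


def hd2_formula(amplitude, freq, clk_amplitude, fall_time):
    """Second harmonic relative to the fundamental: A*w*T_F / (2*A_CLK)."""
    omega = 2.0 * math.pi * freq
    return amplitude * omega * fall_time / (2.0 * clk_amplitude)


def hd2_simulated(amplitude, freq, clk_amplitude, fall_time,
                  n_samples=64, cycles=5):
    """Sample a sine with the signal-dependent delay and measure HD2 by DFT."""
    omega = 2.0 * math.pi * freq
    slope = clk_amplitude / fall_time   # gate voltage slope a
    v_on = clk_amplitude                # assumes V_off = 0 and V_T = 0
    samples = []
    for n in range(n_samples):
        phase = 2.0 * math.pi * cycles * n / n_samples   # omega * n * T
        delay = (v_on - amplitude * math.sin(phase)) / slope
        samples.append(amplitude * math.sin(phase + omega * delay))

    def dft_bin(k):
        # Magnitude of DFT bin k (coherent sampling, so no window is needed).
        return abs(sum(x * cmath.exp(-2j * math.pi * k * n / n_samples)
                       for n, x in enumerate(samples)))

    return dft_bin(2 * cycles) / dft_bin(cycles)


args = (0.75, 50e6, 3.0, 0.1e-9)   # A [V], f [Hz], A_CLK [V], T_F [s]
for name, value in (("formula", hd2_formula(*args)),
                    ("simulated", hd2_simulated(*args))):
    print(f"{name}: {20.0 * math.log10(value):.1f} dBc")
```

Both numbers should land close to the -48 dBc quoted in the text; the simulated value differs from the formula only through the higher-order terms that the first-order expansion drops.
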

There are basically three ways to get around this problem. First, making 
the slope of the clock waveform steep reduces the distortion. A second and 
more complicated solution is to make the switch control voltage track the input 
signal [120]. The best solution, however, is to use a circuit topology in which 
the switch is operated around a constant voltage. This is discussed in more 
detail later in this chapter. 

The bulk effect was ignored in this discussion. In practice it makes the tran- 
sistor threshold voltage signal-dependent, which is another source of distortion. 

2. Charge Injection 

A conducting MOS switch has a finite amount of mobile charge in its channel. 
When the transistor is turned off, this charge is distributed between the source, 
drain, and bulk terminals of the device. To design accurate SC circuits the nature 
of this charge injection and redistribution phenomenon must be understood. 
Over the years charge injection has been analyzed and discussed in 
various papers, e.g. [121, 122, 123, 124]. 

Consider the circuit in Figure 6.3. There, a sampling capacitor is driven by 
the voltage source VIN with the source resistance R through an nMOS switch 
transistor. The capacitances C1 and C2 are associated with the source and 
drain terminals of the transistor, C2 including the sampling capacitor. The 
distribution of the charge is dependent on the ratio of C1 and C2 as well as the 
source resistance R and the waveform of the switch control voltage VG. The 
total amount of the channel inversion layer charge is dependent on the voltage 
VIN. 

Figure 6.5. The gate voltage turn-off waveform. 

Figure 6.3. A circuit model for understanding the charge injection in MOS switches. 

Figure 6.4. A circuit model for distributed gate-channel capacitance. 



The amount of the released charge Qtot can be expressed [124] as 

\[ Q_{tot} = C_G (V_{Gon} - V_T), \tag{6.6} \]

where CG is the total gate channel capacitance, VGon the transistor gate voltage 
in the on-state and VT the effective threshold voltage. A first order approximation 
is that CG is not dependent on the input voltage and VT has a linear input 
voltage dependency through the bulk effect. In that case the amount of released 
charge is linearly dependent on the input voltage, which is experimentally verified 
in [121] and [122]. To model the charge more accurately, the nonlinear 
bulk effect and voltage dependency of CG have to be taken into account. 

As the transistor is turned off, a part of the inversion layer charge is leaked 
to the substrate. This phenomenon, known as charge pumping, is due to two 
effects, the capture of charge by the interface traps and recombination in the 
channel and the substrate. It is shown in [124] that substrate leakage occurs 
only when the gate voltage turn-off slope is extremely steep or the transistor 
channel is very long, and thus in practical switches this effect can be ignored. 

The remaining question is how the inversion layer charge is distributed be- 
tween the drain and the source terminals when the switch is turned off. This can 
be analyzed by using the circuit shown in Figure 6.4 to model the distributed 

Figure 6.6. A simplified MOS switch model for charge injection analysis. 



gate capacitance and assuming the gate voltage to have the waveform shown in 
Figure 6.5 [124]. When the switch is conducting the gate voltage has a value of 
VGon and when it is turned off the voltage is switched to VGoff with the slope 
a. Using this, the channel conductance g can be written as 

\[ g[V_G(t)] = \beta (V_G(t) - V_T) = \beta (V_{Gon} - a t - V_T), \tag{6.7} \]

where β = (W/L) μ Cox. Now, the transistor channel during the turn-off 
can be modeled with the time-varying conductance g. The injected charge is 
modeled with current sources having a value of a·CG/2 in parallel with the 
capacitors C1 and C2, as shown in Figure 6.6. The analysis of this circuit yields 
a differential equation 



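\[ \frac{dV}{dT} = (T - B)\left[\left(1 + \frac{C_2}{C_1}\right)V + 2T\frac{C_2}{C_1}\right] - 1, \tag{6.8} \]

where the normalized factors are 

\[ V = \frac{\Delta V_2}{\frac{C_G}{2}\sqrt{\frac{a}{\beta C_2}}}, \tag{6.9} \]

\[ T = \frac{t}{\sqrt{C_2/(a\beta)}}, \tag{6.10} \]

\[ B = (V_{Gon} - V_T)\sqrt{\frac{\beta}{a C_2}}. \tag{6.11} \]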

Figure 6.7. Charge partitioning as a function of B (reprinted with permission from [35]). 



Solving the equation (6.8) numerically gives the diagram shown in Figure 6.7. 
There, the quantity ΔQ2/Qtot is expressed as a function of parameter B. The 
family of curves represents the solutions with different C1/C2 ratios. When 
the values of B are small the total charge is equipartitioned between the two 
capacitors, regardless of the capacitor ratio. On the other hand, when B is large, 
the charge is partitioned according to the capacitor ratio. The intermediate B 
values result in a partitioning somewhere between the extreme cases. 

Since the parameter B is dependent on the turn-off time through the slope a, 
the meaning of the result can be understood as follows. When the transistor is 
turned off rapidly, the channel is cut off before the potential difference between 
drain and source has time to even out and, as a result, the channel charge is 
equally divided between the drain and source terminals. On the other hand, a 
slow turn-off leaves time for the drain and source voltages to become equalized, 
which results in a charge partitioning according to the capacitor ratio. In this 
analysis the source resistance R is assumed to be infinite. A more complete 
study with finite R values is performed in [123], showing that the smaller the 
source impedance, the smaller the share of the injected charge that ends up in 
the capacitor C2. 
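
The two limiting cases can be reproduced by integrating the normalized turn-off equation numerically. The sketch below assumes the equation has the form dV/dT = (T - B)[(1 + C2/C1)V + 2T·C2/C1] - 1 with V(0) = 0, that the channel is cut off at T = B, and that the normalization is such that the magnitude of ΔQ2/Qtot equals -V(B)/(2B). These forms follow from the simplified model with infinite source resistance and should be checked against [124] before being relied on. The function name and the step-count rule are choices made for this illustration.

```python
def charge_fraction_c2(b, c2_over_c1):
    """Fraction of the channel charge that ends up in C2 (infinite source R).

    Integrates dV/dT = (T - B) * ((1 + r) * V + 2 * T * r) - 1, r = C2/C1,
    from T = 0 (gate starts to fall) to T = B (channel cut off) with RK4.
    """
    r = c2_over_c1

    def dv_dt(t, v):
        return (t - b) * ((1.0 + r) * v + 2.0 * t * r) - 1.0

    # Keep the step small compared with the fastest time constant 1/((1+r)*B).
    steps = max(1000, int(2.0 * (1.0 + r) * b * b) + 1)
    h = b / steps
    t, v = 0.0, 0.0
    for _ in range(steps):
        k1 = dv_dt(t, v)
        k2 = dv_dt(t + 0.5 * h, v + 0.5 * h * k1)
        k3 = dv_dt(t + 0.5 * h, v + 0.5 * h * k2)
        k4 = dv_dt(t + h, v + h * k3)
        v += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    # Assumed normalization: |dQ2 / Qtot| = -V(B) / (2 * B).
    return -v / (2.0 * b)


for ratio in (0.1, 1.0, 10.0):
    row = ", ".join(f"B={b:g}: {charge_fraction_c2(b, ratio):.3f}"
                    for b in (0.01, 1.0, 30.0))
    print(f"C2/C1 = {ratio:g} -> {row}")
```

For small B the fraction tends to one half whatever the capacitor ratio, and for large B it approaches C2/(C1 + C2), which matches the behavior described for Figure 6.7.
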

In practical circuits the slope of the gate voltage is usually in the region where 
the charge partitioning is dependent on the slope and capacitor values. Conse- 
quently, the amount of injected charge is not well controlled and as a result of 
this, different strategies are used and proposed to overcome the problem. First, 
the capacitor C1 in parallel with the driving voltage source and the parameter 
B can be made large, so that the injected charge returns to the driving circuit. 
This, however, increases the capacitive load and makes the circuit slow. Another 
strategy is to make the capacitors equal so as to make the charge injection equal 

Figure 6.8. Canceling the charge injection with a dummy switch. 



as well. The same result can be achieved by making the parameter B much 
smaller than 1. In the latter two cases, the charge injection can be canceled by 
using a half-sized dummy switch as illustrated in Figure 6.8. 

The first order approximation of the amount of injected charge (6.6) indicates 
that the charge is linearly dependent on the input voltage. In the S/H circuit 
shown in Figure 6.1 this only affects the gain of the circuit, which is usually not 
very harmful. In practice, the charge injection also has a nonlinear component, 
which results in harmonic distortion. Even the linear component alone can 
cause distortion in some types of SC circuits, e.g. by changing the interstage 
gain in pipelined ADCs. 

When simulating SC circuits in SPICE the designer should be aware that the 
charge injection is not completely modeled in older MOS models. The quasi-static 
transient model used in those may give incorrect results, especially when 
the slope of the gate voltage is very steep [125]. In more recent models, such as 
BSIM3v3, more accurate charge injection modeling is achieved by employing 
a non-quasi-static model. 

3. Bottom Plate Sampling 

The discussion in the previous sections has shown that both the signal- 
dependent charge injection and the signal-dependent turn-off moment originate 
from the fact that the switch transistor sees the input voltage in its source termi- 
nal. If the switch was operated around a fixed voltage, the error resulting from 
both phenomena would be constant. A constant error is less harmful in many 
applications and it can be reduced with a differential circuit topology. 

In many closed-loop S/H architectures the sampling switch is connected to 
a virtual ground to avoid signal-dependent errors. An example is shown in 
Figure 6.9. The feedback loop includes two opamps, which inevitably slows 
down the circuit. By using more than one switch, sampling against a constant 
potential can be achieved without enclosing the switch in the feedback loop. 
This well-known technique [126], called bottom plate sampling (also known as 
series sampling), is illustrated in Figure 6.10. The capacitor C is the sampling 

Figure 6.9. An example of a closed-loop architecture where the sampling switch operates 
against a fixed voltage. 

Figure 6.10. Bottom plate sampling. 



capacitor and the capacitor CL is a parallel combination of the input capacitance 
of the following circuitry and the parasitic capacitances associated with the 
switch S2 and the sampling capacitor. 

The idea goes as follows. In sampling mode the switches S1 and S2 are 
conducting, while the switch S3 is open; thus, the input voltage is sampled in 
the capacitor C. At the sampling instant the signal φ' goes down, opening S2, 
which leaves node Vout floating. Since the switch S2 is always connected 
to the ground the charge it injects into node Vout is constant. Slightly later, 
the capacitor C is disconnected from the input by opening the switch S1. The 
charge injection and the input voltage variation during the time gap between 
opening S2 and S1 distort the voltage on C. This, however, is not harmful, 
since the sampled signal is in the form of a charge at node Vout. This charge 
cannot change after S2 is opened, because there is no other DC path from that 
node. The sampling is completed by connecting the left-hand terminal of the 
capacitor C to the ground by closing the switch S3. 

In hold mode the signal can be taken out from node Vout either as a voltage 
or as a charge. If Vout is connected to a high impedance its voltage is just an 
inversion of the sampled input voltage. In practice, the load capacitance CL 
causes some attenuation and, if the capacitance is signal-dependent, harmonic 
distortion. Alternatively, node Vout can be connected to a virtual ground, 
which makes it possible to handle the sample in the form of a charge. 
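
The attenuation caused by the load capacitance follows directly from charge conservation at node Vout. The sketch below assumes ideal switches, a constant (linear) CL, and no injected charge; the function name and the component values are illustrative.

```python
def bottom_plate_vout(v_in, c_sample, c_load):
    """Hold-mode voltage at node Vout for an ideal bottom-plate sampler."""
    # Charge trapped at node Vout when S2 opens: the node sits at ground,
    # the other plate of C is at v_in, and CL holds no charge.
    q_node = -c_sample * v_in
    # After S3 grounds the input side of C, the trapped charge is shared
    # between C and CL, both now referred to ground.
    return q_node / (c_sample + c_load)


# Hypothetical values: 1 pF sampling capacitor, 0.25 pF load capacitance.
print(bottom_plate_vout(1.0, 1e-12, 0.0))       # ideal inversion
print(bottom_plate_vout(1.0, 1e-12, 0.25e-12))  # attenuated by C / (C + CL)
```
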

4. Nonlinear Time Constant 

In sampling mode the circuit in Figure 6.1 forms a low pass filter; consequently, 
the maximum frequency that the circuit can track is limited. The 3-dB 
frequency of the circuit is 

\[ f_{3dB} = \frac{1}{2\pi R_{on}(C_S + C_{p2} + C_2)}, \tag{6.12} \]

where the clock driver output impedance RCLK is assumed to be zero. If 
absolute accuracy is required then the minimum 3dB frequency is given by [10] 

h d B > (6.13) 

where f is the frequency of the input signal and N the resolution in bits. Using 
these two equations the maximum switch on-resistance can be calculated. For 
example, 12-bit accuracy in the 100-MHz signal band sets Ron to 18 Ω, when a 
total capacitance of 2 pF is assumed. This begins to approach the limits of what 
is achievable with a single-transistor switch or a transmission gate with today’s 
technologies. In most circuits, however, some attenuation can be tolerated, 
making higher resolutions and wider bandwidths possible. 
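
The order of magnitude of this example can be checked numerically. The accuracy criterion used in the sketch is an assumption made for the illustration, not necessarily the expression given in [10]: the gain error of the first-order low pass at the signal frequency is required to stay below 2^-N. The function name is likewise hypothetical.

```python
import math


def max_on_resistance(n_bits, f_signal, c_total):
    """Largest Ron for which the RC attenuation at f_signal is below 2**-n_bits.

    Assumed criterion: 1 / sqrt(1 + (f / f_3dB)**2) >= 1 - 2**-n_bits.
    """
    err = 2.0 ** (-n_bits)
    ratio = math.sqrt(1.0 / (1.0 - err) ** 2 - 1.0)   # f_signal / f_3dB
    f_3db = f_signal / ratio
    return 1.0 / (2.0 * math.pi * f_3db * c_total)


r_on = max_on_resistance(n_bits=12, f_signal=100e6, c_total=2e-12)
print(f"maximum Ron: {r_on:.1f} ohm")
```

With this criterion the result comes out slightly below 18 Ω, which is consistent with the figure quoted in the text.
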

Unlike attenuation, harmonic distortion is intolerable in most applications. 
When the signal amplitudes are large, accuracy and signal bandwidth are limited 
by distortion, which originates from the fact that switch on-resistance and stray 
capacitances are not constant but vary as functions of drain and source voltages. 
For a short channel device the on-resistance is 

\[ R_{on} = \frac{1 + \frac{|V_D - V_S|}{E_c L}}{\mu_{eff} C_{ox} \frac{W}{L}\left[V_G - \frac{V_S}{2} - \frac{V_D}{2} - V_{T0} - \gamma\left(\sqrt{V_S - V_B + 2\phi_F} - \sqrt{2\phi_F}\right)\right]}, \tag{6.14} \]

where VG, VS, VD, and VB are the voltages on the transistor's gate, source, drain, and bulk 
terminals. By looking at the equation three different signal-dependent terms can be identified. 
The first and clearly dominant one is the gate-channel voltage VG - (VS + VD)/2 in the denominator. 
The second is the threshold voltage dependency on the source-bulk voltage (bulk effect) 
modeled with the square root terms in the denominator. The last is the term in the numerator 
which depends on the drain-source voltage, the critical electric field Ec, and the device channel 
length. 

The dominant nonlinear parasitic capacitances are the drain and source junction capacitances, 
which are given by [127] 

\[ C_{jbx} = C_{j0}\left(1 - \frac{V_B - V_X}{P_B}\right)^{-M_j}, \tag{6.15} \]

where VX is the drain or source voltage, Cj0 the junction capacitance with a zero bias, PB the 
bulk junction potential, and Mj the bulk junction grading coefficient. Actually, the capacitance 
is a sum of two components, the tub bottom capacitance and the sidewall capacitance, which 
both follow equation (6.15) but with different parameters. 
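
The voltage dependence of the junction capacitance is easy to tabulate. The sketch below assumes the standard SPICE-style expression Cj = Cj0 · (1 - (VB - VX)/PB)^(-Mj) for a single component; the default values PB = 0.8 V and Mj = 0.5 are illustrative and not taken from any particular process.

```python
def junction_capacitance(v_x, cj0, v_bulk=0.0, pb=0.8, mj=0.5):
    """Junction capacitance of a drain or source diffusion at voltage v_x.

    cj0: zero-bias capacitance, pb: junction potential, mj: grading
    coefficient (pb and mj defaults are illustrative assumptions).
    """
    return cj0 * (1.0 - (v_bulk - v_x) / pb) ** (-mj)


# Capacitance relative to its zero-bias value as the signal level rises.
for v in (0.0, 0.5, 1.0, 2.0):
    print(f"VX = {v:.1f} V: Cj/Cj0 = {junction_capacitance(v, 1.0):.3f}")
```

For an nMOS device with a grounded bulk the capacitance falls as the signal voltage rises, which is the signal dependence that produces distortion.
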

Figure 6.11. The junction capacitances of an nMOS transistor are linearized with diodes. 

There are basically two ways to reduce distortion: decreasing the absolute value of the time 
constant and making the time constant less nonlinear. The remainder of this section discusses 
these techniques in more detail, reviews known switch circuits, and proposes some new ones. 

4.1 Linearization of Basic Switches 

The commonly used basic switch — the transmission gate — can itself be considered as a 
linearized circuit; as the signal voltage rises the increase in the on-resistance of the nMOS 
transistor is compensated for by the decrease in pMOS on-resistance and vice versa. Similarly, 
as the voltage rises, the drain and source junction capacitances of the nMOS decrease, while in 
the pMOS the opposite happens. 

The relative size of the transistors (the width of the pMOS compared to the width of the 
nMOS) can be optimized in order to minimize distortion, as demonstrated with simulations by 
the author in [10]. In principle, to the first order (ignoring the junction capacitances, the bulk 
effect, and the short channel effects) the time constant can be made flat in the voltage range where 
both the transistors are conducting (from VT,p to VDD - VT,n). Optimum sizing, however, is 
rather sensitive to process parameters and thus is not the same in different process corners. 
Consequently, size optimization can yield only moderate linearity improvements. 
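
The first-order flatness of the transmission gate can be seen from a simple square-law triode model. The sketch assumes that each transistor conducts with g = β(VGS - VT) when it is on, that the nMOS gate is at VDD and the pMOS gate at ground, and that the bulk effect, the junction capacitances, and the short channel effects are ignored. The threshold values and the supply voltage are illustrative.

```python
def tg_conductance(v_sig, vdd, beta_n, beta_p, vt_n=0.5, vt_p=0.5):
    """First-order on-conductance of a transmission gate at level v_sig.

    vt_p is the magnitude of the pMOS threshold voltage.
    """
    g_n = max(0.0, beta_n * (vdd - v_sig - vt_n))   # nMOS, gate at VDD
    g_p = max(0.0, beta_p * (v_sig - vt_p))         # pMOS, gate at ground
    return g_n + g_p


vdd = 2.5
for beta_p in (1.0e-3, 0.5e-3):   # matched and mismatched pMOS strength
    values = [tg_conductance(v / 10.0, vdd, 1.0e-3, beta_p)
              for v in range(6, 20, 2)]
    print(f"beta_p = {beta_p:.1e}: "
          + ", ".join(f"{g * 1e3:.2f}" for g in values) + " mS")
```

With equal β values the sum is constant over the range where both devices conduct; any mismatch, such as a process corner shifting one β, tilts the curve, which is why the optimum sizing is sensitive to process parameters.
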

In a single-transistor switch the junction capacitances can be linearized by putting diodes 
of opposite type in parallel with the junctions [128], as shown in Figure 6.11. This makes 
the capacitances more symmetrical about the mid-supply level, which decreases the even order 
distortion. 

The DC level of the signal has a large impact on the harmonic distortion, which is especially 
emphasized with single-transistor switches. When using nMOS switches it is advantageous to 
situate the signal range as low as possible, since switch on-resistance and distortion increase 
rapidly toward high voltage values. In low-voltage designs, however, the signal range is usually 
a considerable portion of the supply voltage and so the DC level cannot be set far apart from the 
mid-supply level. 

Some technologies offer low-threshold devices, which can be used to extend the signal range 
and reduce distortion. In very low-voltage circuits the gained extra gate overdrive of some 
hundreds of millivolts can help a great deal. The low threshold voltage, however, may prevent 
the switch from being properly turned off, which can result in a leakage of the stored charge. 

An effective way to reduce switch on-resistance and to extend the linear range is to control 
the switch transistor gate with a voltage higher than the supply (in the case of nMOS). 
This is illustrated in Figure 6.12, where the on-resistances of nMOS switches with different gate 
overdrives are plotted as a function of signal level. There are a couple of ways to realize the higher 
voltage. The most straightforward is to supply the voltage externally, which, however, is often 
too costly from the system point of view. Thus, a better solution is to generate the voltage on 
chip. Since the current drain from this supply is relatively small, it can be easily implemented 



Figure 6.12. nMOS switch on-resistance as a function of signal level. 




Figure 6.13. A MOS switch with local gate voltage boost circuit [21]. 



with a charge pump. To avoid potential cross talk problems and routing the voltage around the 
chip, the voltage generation is often distributed. When this principle is adhered to as far as 
possible, each switch has a charge pump of its own. 

4.2 Gate Voltage Boosting 

A switch with a local charge pump circuit is shown in Figure 6.13 [21]. There, the capacitor 
C1 is charged to VDD - VT when clk is high. At the same time, the gate of the switch transistor 
is held at ground by the transistor M3. When the clock goes down C1 boosts the gate of the 
switch transistor to 2VDD - VT. In practice, the voltage is somewhat lower because of parasitic 
capacitances. 

Another local charge pump circuit is shown in Figure 6.14 [71, 129]. In the previous circuit 
the capacitor precharging is carried out through the diode-connected nMOS transistor M1. This 
diode switch limits the precharge voltage to VDD - VT. In Figure 6.14 the capacitor (now C2) 
is precharged through an nMOS switch M2. The capacitor can be charged to VDD since the gate 
of M2 is controlled by a boosted voltage generated with M1 and C1 [130]. The well bias for the 
pMOS transistor M3 is produced with another charge pump. More switch boost circuits can be 
found in references [120, 131, 132]. 

Figure 6.14. Another local boost circuit for a MOS switch [71]. 

Figure 6.15. A switch controlled with input tracking gate voltage. 

Figure 6.16. SC implementation of the offset voltage source. 



The voltage source in Figure 6.15 can be implemented with a switched capacitor, which is 
precharged in every clock cycle. Such a circuit is shown in Figure 6.16. During the clock phase 
when the transistor is non-conductive the capacitor C1 is precharged to V1 - V2. To turn the 
switch on, the capacitor is switched between the input voltage and the transistor gate. The gate 
voltage, however, is not exactly the sum of the input voltage and the precharge voltage, since 
the parasitic capacitances associated with the switch transistor and the auxiliary switches cause 
some distortion. The gate voltage is given by 



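\[ V_G = \frac{C_1 (V_1 - V_2)}{C_{tot}} + \frac{(C_1 + C_2 + C_3)\,V_{IN}(t)}{C_{tot}} - \frac{C_2 V_{IN}(t_0)}{C_{tot}} - \frac{C_3 V_{OUT}(t_0)}{C_{tot}} \tag{6.16} \]

\[ \phantom{V_G} = V_O + V_{IN}(t) - \frac{C_4 V_{IN}(t)}{C_{tot}} - \frac{C_2 V_{IN}(t_0)}{C_{tot}} - \frac{C_3 V_{OUT}(t_0)}{C_{tot}}, \tag{6.17} \]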



where Ctot is the total capacitance (C1 + C2 + C3 + C4). In the second form the first two 
terms, the offset voltage VO and the input voltage, are the desired ones, while the three others 
are unwanted. The capacitive division between the capacitances from the input node to the gate 
and from the gate to the signal ground results in a term proportional to C4. The last two terms, 
proportional to the drain and source overlap capacitances of the switch transistor, result from 
the fact that at the end of the switch off-phase (time t0) the voltages at the source and at the 
drain are sampled into the parasitic capacitances. In order to minimize distortion, the parasitic 
capacitances have to be minimized and the bootstrapping capacitor C1 made large. 
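
To get a feel for the size of the unwanted terms, the second form of the gate voltage, equation (6.17), can be evaluated numerically. The Python sketch below is illustrative only: the function name and all component values are assumptions made for the example, not values taken from the text, and the output voltage is assumed to follow the input during tracking.

```python
def gate_voltage(v_in, v_in_t0, v_out_t0, v_pre, c1, c2, c3, c4):
    """Gate voltage of the bootstrapped switch in the form of equation (6.17).

    v_pre is the precharge voltage V1 - V2 of the bootstrap capacitor C1,
    c2 and c3 are the overlap capacitances and c4 the gate-to-ground parasitic.
    """
    c_tot = c1 + c2 + c3 + c4
    v_offset = c1 * v_pre / c_tot          # desired offset voltage V_O
    return (v_offset + v_in
            - c4 * v_in / c_tot            # capacitive division term
            - c2 * v_in_t0 / c_tot         # input voltage sampled at t0
            - c3 * v_out_t0 / c_tot)       # output voltage sampled at t0


# Hypothetical values: 1 pF bootstrap capacitor, 20 fF overlaps, 50 fF to ground.
ideal = gate_voltage(0.5, 0.0, 0.0, 1.8, 1e-12, 0.0, 0.0, 0.0)
real = gate_voltage(0.5, 0.0, 0.0, 1.8, 1e-12, 20e-15, 20e-15, 50e-15)
print(f"ideal gate voltage: {ideal:.3f} V, with parasitics: {real:.3f} V")
```

With these assumed values the parasitics lower the gate voltage by well over 100 mV, which illustrates why the bootstrap capacitor has to be large compared with the parasitic capacitances.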



4.3.2 Circuits from the Literature 

A practical implementation of the bootstrapped switch is shown in Figure 6.17 [133, 102]. 
There, the offset voltage is realized with the capacitor C1, which is precharged to Vdd during 
the main switch off-period. To turn on the switch MS, the precharged capacitor is connected 
between its source and gate via the series switches M1 and M4. Turning off MS is performed by 
disconnecting the capacitor C1 and pulling down the gate with M6. The transistor M5 is needed 
to prevent the gate-source voltage of M6 from exceeding Vdd. 

A slightly modified version of this circuit is presented in [134] and shown in Figure 6.18. 
There the nMOS transistor M3 is replaced with a pMOS transistor whose gate is tied to the gate 
of MS. Consequently, the charge pump on the left in Figure 6.17 can be eliminated. 

Figure 6.17. A long-term reliable bootstrapped switch [133, 102]. 

Figure 6.18. Another bootstrapped switch [134]. 



Some other bootstrapped switches are presented in [135], [136], and [137]. These circuits, 
however, are not targeted at deep submicron technologies and are thus not necessarily suitable for 
low supply voltages. In baseband delta-sigma modulators the signal does not change much 
within a clock cycle, making it possible to bootstrap the switch with a signal value sampled half 
a clock period earlier [138, 139]. 

4.3.3 Eliminating the Bulk Effect in Triple Well Technology 

The linearity of these circuits is limited by second order effects, the two most important of 
which are the bulk effect and the voltage dependent junction capacitances. In a triple well process, 
an example of which is a typical BiCMOS process, both of them can be almost completely 
eliminated by connecting the switch transistor bulk to the input node during the tracking phase. 
In the circuit shown in Figure 6.18 this can simply be done by connecting the bulks of transistors 
MS and M1 to node n1. They cannot be connected directly to the input, because when the switch 



Figure 6.19. Simulated distortion as a function of signal source output impedance. 



is open the voltage Vin may go far below Vout, making the diode from the output node to the 
bulk forward biased. In some circuits a buffer [136, 137] or a resistive circuit [140] is inserted 
between the input and the bulk node to isolate the capacitance of the well-to-substrate junction 
from the input. This, however, increases the circuit’s complexity and introduces phase shift 
between the input and the bulk voltage. Thus, the advantage gained is questionable. 

In some cases the elimination of the junction capacitances worsens the distortion, because 
normally the capacitance nonlinearity partially cancels the on-resistance nonlinearity; for in- 
stance, in an nMOS device the junction capacitance decreases, while the on-resistance increases, 
as the voltage level rises. Furthermore, the bulk well, which is now connected to the input, 
has a junction capacitance of its own, which can be in the same size range as the eliminated 
capacitance of the source and drain junctions. Whether the capacitance nonlinearity dominates 
over the bulk effect or not depends on the output impedance of the signal source. 

This is investigated with a transient simulation, which is repeated for different values of signal 
source output impedance. The levels of the second and the third harmonic are identified from the 
spectrum of the tracked voltage across the sampling capacitor. The simulated circuit is a simple 
SC sampling circuit utilizing the switch shown in Figure 6.18, with the bulks of the main switch 
transistor MS and the input- or output-connected auxiliary nMOS transistors (M1, M8, and M9) 
made to track the signal. For comparison the simulation is also performed for the same circuit 
with the bulks connected to Vss. The signal source is a 44-MHz, 3.0-Vpp sinusoidal voltage 
source with a series resistance Rint. An equal termination resistor Rterm is put in parallel 
with the sampling circuit, resulting in a 1.5-Vpp input signal for the switch. 

The results, shown in Figure 6.19, indicate that the level of the second harmonic in 
particular depends strongly on the resistance. Moreover, in the switch with the bulk node 
connected to a constant potential the two nonlinearity sources partially cancel each other out, 
resulting in a frequency-dependent distortion minimum, in this case at 240 Ω. In the third 
harmonic, which is more important in fully differential circuits, a similar clear cancellation 
cannot be seen. At low resistance values the circuit with input tracking bulk yields lower 
distortion and thus is the preferable choice for a 50-Ω system in the simulated case. 

Figure 6.20. A bootstrapped switch without bulk effect in a standard CMOS technology. 

When a triple well process is not used the cancellation effect discussed above can be exploited 
to improve the linearity in a limited frequency range by adjusting the output impedance of the 
signal source, for instance with a resistor in series with the switch. 

In some switch circuits a buffer [136, 137] or a resistive circuit [140] is inserted between 
the input and the bulk node to isolate the capacitance of the well-to-substrate junction from the 
input. This, however, increases the circuit’s complexity and introduces phase shift between the 
input and the bulk voltage. Another possibility is to let the bulk node float during the tracking 
phase, resulting in a parasitic structure, which is a series connection of two diodes of opposite 
type. Now the capacitance is smaller and less nonlinear, but as a drawback the bulk effect is 
only partially eliminated. However, in a 50-Ω environment this is often a good compromise, 
especially when a low third harmonic is of interest. 

4.3.4 Eliminating the Bulk Effect in Standard CMOS Technology 

The improvement in linearity achieved by making the bulk node track the input is so remark- 
able that accomplishing the same with a standard CMOS technology would be desirable. The 
problem there is that the bulk node of the nMOS device (in a typical process with p-type sub- 
strate) cannot be accessed. Thus, the switch transistor has to be implemented with an inherently 
more resistive pMOS device. 

Let us consider the circuit shown in Figure 6.18 and change every nMOS transistor to pMOS 
and vice versa. Most of them can be changed without problems, but not the transistors M3 and 
M4, because now nodes n2 and n3 go below Vss, which would make the junction diode of any 
nMOS device connected to these nodes forward biased. 

The author's proposal for overcoming this problem is shown in Figure 6.20 [2]. There, 
the device M3 is implemented with a diode-connected pMOS device as in Figure 6.13, but it can 
also be implemented as in Figure 6.17. 



Figure 6.21. Simulated waveforms in the circuit shown in Figure 6.20. 



The implementation of the series switch between the gate of the transistor MS and the bootstrap 
capacitor is more difficult. The switch has to be a pMOS transistor with the ability to 
conduct at voltage levels much below Vss. Thus, its gate voltage has to be bootstrapped. 

The operation of the circuit can best be understood by looking at the simulated voltages in 
Figure 6.21. When the main switch is turned off, its gate node n3 is shorted to Vdd and the 
gate of M4 is connected to n3 via M7. The off-phase ends when clk goes up. First, M7 is 
turned off and M8 on, and, as a result, the voltage at node n5 starts to go down. Since n3 is 
still shorted to Vdd, a voltage appears across the capacitor C2. A short delay later, clk2 also 
goes up, releasing node n3, which now starts to go down as M4 opens. When the voltage at 
n5 reaches the threshold level of M8, the transistor automatically turns off, leaving node n5 
floating. The bootstrap capacitor C2 makes n5 follow n3 with an offset large enough to keep M4 
properly conducting. To lower the threshold voltage of M4 its bulk is switched to Vss during 
the on-phase. 

To ensure that the circuit is reliable in the long term, the gate-source 
or gate-drain voltage of the transistors connected to nodes n2, n3, and n5 must not exceed Vdd. 
Since the gate voltage of M4 follows node n3 there are no problems with that device. The 
reliability of M7 and M8 is guaranteed by connecting their gates to n4 instead of Vdd when 
the devices are to be off. 

Since the transistor M4 is not a very large device and its on-resistance does not need to be 
very linear, the second bootstrap capacitor C2 and the transistors M7 and M8 can be fairly small. 
Thus, the added parasitic capacitance is not large. 

The linearity of the proposed circuit has been compared with the circuit shown in Figure 6.18 
with transient simulations. The results show a lower third harmonic for the proposed circuit at 
signal source resistance values greater than 250 Ω and a lower second harmonic at all resistance 
values except the optimum 240 Ω of the reference circuit (see Figure 6.19). In conclusion, the 
circuit is the preferable choice in single-ended applications with a low to moderate impedance 
signal source (e.g. 50 Ω) and in all applications with a high impedance signal source (> 250 Ω). 

Figure 6.22. Double-side bootstrapped switch. 

Figure 6.23. High-pass feedthrough paths past a closed switch and a technique to cancel the 
feedthrough. 

Two other standard CMOS circuits for eliminating the bulk effect have been proposed in the 
literature. In the first one [141] the bootstrap capacitor is connected directly, without a series 
switch, to the gate of the switch transistor. During the off-state this capacitor terminal is shorted 
to Vdd and the other terminal is charged to 2Vdd with an additional charge pump. 

The second circuit [142] removes the bulk effect, but not the effect of the junction capacitances. 
There, the signal voltage is not directly connected to the gate of the switch device, but is first 
predistorted to account for the bulk effect. This is done by generating a proper gate-source 
voltage with a dummy switch, whose source tracks a buffered signal voltage. The dummy 
switch operates in the saturation region and it carries a constant current. 

4.3.5 Double-Side Bootstrapping 

By looking at the on-resistance equation (6.14) it can be seen that the dominant 
signal-dependent term VG - (VS + VD)/2 is not made exactly constant with the circuits presented thus 
far, since, although the drain and source voltages are close to each other, they are not the same 
because of the signal current through the switch; thus, the achievable linearity is limited. 

A proposal as to how to overcome this problem is shown in Figure 6.22. There, the idea is 
to make the switch transistor gate follow the average of the drain and the source voltage rather 
than either of them alone. This is accomplished by adding another bootstrap capacitor on the 
output side of the switch transistor, and thus a suitable name for the technique is double-side 
bootstrapping. The total capacitance need not be increased, since the original bootstrap capacitor 
can be divided into two equal pieces without sacrificing accuracy. It appears from simulations 
that the improvement achieved in linearity can be of the order of 5 decibels on the level of the 
2nd harmonic. 
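
The difference between the two bootstrapping schemes can be illustrated with a few lines of Python. This is an idealized sketch, not the circuit of Figure 6.22: parasitic capacitances are ignored, and the function name, the offset voltage, and the operating points are all assumptions chosen for the example.

```python
def overdrive_term(v_s, v_d, v_offset, double_side):
    """Return V_G - (V_S + V_D)/2 for an idealized bootstrapped switch.

    Single-side: the gate follows the source, V_G = V_S + v_offset.
    Double-side: the gate follows the average of the source and the drain.
    Parasitic capacitances are ignored (assumption of this sketch).
    """
    tracked = (v_s + v_d) / 2 if double_side else v_s
    v_g = tracked + v_offset
    return v_g - (v_s + v_d) / 2


# Hypothetical operating points: the drop V_S - V_D varies with the signal current.
for v_s, v_d in [(0.50, 0.48), (0.50, 0.52)]:
    single = overdrive_term(v_s, v_d, 1.8, double_side=False)
    double = overdrive_term(v_s, v_d, 1.8, double_side=True)
    print(f"single-side: {single:.3f} V, double-side: {double:.3f} V")
```

In the single-side case the term changes with half of the voltage drop across the switch, whereas in the double-side case it stays at the offset voltage in this idealized model.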

A slightly different double-side bootstrapped switch is proposed in [143]. It is intended, however, 
not for enhancing linearity but for improving reliability at the beginning of the on-phase, when it is not 
known which side of the switch transistor is at the lower potential. 



4.3.6 Reducing Feedthrough 

In high-frequency high-resolution applications the switch transistor is very large, even when 
bootstrapping is applied. The large transistor has large parasitic capacitances from the drain and 
the source to the gate and bulk nodes. In a bootstrapped switch the parasitic capacitances of the 
auxiliary circuitry have to be minimized, and thus no large devices are allowed. Consequently, 
the impedances shorting the gate and the bulk of the switch transistor to the ground during the 
switch off-phase are not negligible, which results in high-pass feedthrough paths past the closed 
switch, as illustrated in Figure 6.23 (a). Feedthrough is a severe problem, especially in S/H 
circuits, where the high frequency signal is present in the input whether the switch is open or 
closed. 

A technique for reducing feedthrough is proposed in Figure 6.23 (b). It is based on the idea 
that in differential circuits the complementary signal can be used to cancel the feedthrough by 
generating a signal with equal magnitude but opposite phase. This is done by forming capacitors 
equal to C1 and C3 with a dummy transistor half the size of the switch and connecting these 
capacitors to the complementary input signal. The cancellation achieved is remarkable, but not 
perfect because of the signal dependency of the capacitor values. 

In Figure 6.24 the proposed technique is applied in a bootstrapped switch realized with a 
triple-well process. During the switch on-phase the dummy transistor is connected to the same 
input as the switch to make the voltage over the junction diodes constant, which is essential 
in order to avoid distortion. Besides feedthrough reduction, the technique also brings another 
advantage: in the gate voltage equation (6.17) the term proportional to the input voltage at the 
end of the off-phase is also canceled. This switch, as a double-side bootstrapped version, has 
been employed as an input switch in the IF-sampling ADC presented in Chapter 12. 

4.3.7 Bootstrapped Switch as a Sampling Switch 

Thus far the presented circuits are intended to be used as an input switch (switch S1 in 
Figure 6.10) in a circuit using the bottom plate sampling technique. Since the sampling switch 
(S2) is connected to the ground from its other end, it does not see the signal swing and thus its 
on-resistance is allowed to be signal-dependent. In high frequency applications, however, the signal 
current may have such a large amplitude that it produces a significant voltage drop in the sampling 
switch on-resistance (for example, a 100-MHz, 1-Vpp signal sampled in a 5-pF capacitor with a 
switch which has 10-Ω on-resistance results in a 31-mVpp voltage drop). As a result, the charge 

Figure 6.24. A bootstrapped switch with feedthrough cancellation. 



released when the switch is opened is not totally signal-independent. Bootstrapping the sampling 
switch reduces the on-resistance as well as makes the charge injection less signal-dependent. 
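
The 31-mVpp figure quoted above can be reproduced with a short calculation. The Python sketch below assumes a sinusoidal input and an on-resistance much smaller than the capacitor impedance, so that the capacitor current is approximately C·dV/dt; the function name is hypothetical.

```python
import math


def switch_drop_vpp(freq_hz, v_in_vpp, c_sample, r_on):
    """Peak-to-peak voltage drop across the switch on-resistance.

    Assumes a sinusoidal input and R_on much smaller than 1/(2*pi*f*C),
    so the capacitor current is approximately C * dV/dt.
    """
    i_pp = 2 * math.pi * freq_hz * c_sample * v_in_vpp   # peak-to-peak current
    return i_pp * r_on


# Values from the example in the text: 100 MHz, 1 Vpp, 5 pF, 10 ohm.
drop = switch_drop_vpp(100e6, 1.0, 5e-12, 10.0)
print(f"{drop * 1e3:.1f} mVpp")
```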

To be used as a sampling switch, the circuit shown in Figure 6.18 has to be slightly modified 
in order to guarantee that the charge escaping from the bootstrapping circuit does not distort the 
sampled signal charge. This can be done by turning off the transistor M1 slightly before the 
switch transistor MS, which can be realized by connecting the gate of M1 to clk instead of node n3. 
Now the switch M1 does not have an on-resistance as small and signal-independent as before 
but, because of the small signal swing seen by the sampling switch, this is not critical. 

5. Sampling Function 

In the previous discussion the effect of the finite on-resistance of a MOS switch was modeled 
with a limited bandwidth in tracking mode. The same thing can also be thought of in a different 
way. As a result of the low-pass filtering effect the voltage across the sampling capacitor depends 
not only on the instantaneous value of the input voltage but also on its previous values. When 
a sample is taken it is a weighted average of the input values from the beginning of the time to 
the sampling moment. In practice, only a short time period has to be taken into account. The 
weighting can be modeled with the sampling function and the averaging with integration, as 
done in equation (3.5). 
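
The weighted-average view can be checked numerically for the simplest case. The sketch below assumes an ideal switch that opens instantaneously, so that the weighting is just the exponential impulse response of the RC tracking network; this is a special case chosen for illustration, not the general sampling function of equation (3.5). The component values and the sampling moment are assumptions.

```python
import math


def sampled_value(v_in, t_sample, tau, n_tau=20.0, steps=20000):
    """Weighted average of v_in up to t_sample with an exponential weight.

    The weight (1/tau)*exp(-(t_sample - t)/tau) is the impulse response of
    the RC tracking network; it is integrated over the last n_tau time
    constants with the midpoint rule.
    """
    dt = n_tau * tau / steps
    total = 0.0
    for i in range(steps):
        age = (i + 0.5) * dt                 # time before the sampling moment
        total += v_in(t_sample - age) * math.exp(-age / tau) / tau * dt
    return total


f_in = 100e6                 # assumed input frequency
tau = 10.0 * 5e-12           # assumed R_on = 10 ohm and C = 5 pF
w = 2 * math.pi * f_in
t_s = 3.3e-9                 # arbitrary sampling moment

numeric = sampled_value(lambda t: math.sin(w * t), t_s, tau)
# Steady-state response of a first-order low-pass filter, for comparison.
analytic = math.sin(w * t_s - math.atan(w * tau)) / math.sqrt(1 + (w * tau) ** 2)
print(numeric, analytic)
```

The two numbers agree closely, which confirms that, under these assumptions, the bandwidth-limited model and the weighted-average model describe the same sample.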

In a real circuit the resistance of the MOS switch cannot be changed from its constant on-value 
(the nonlinear effects are neglected in this discussion) in tracking mode to infinite in zero time. 
This is due to the finite slope of the transistor gate voltage and the gate voltage feedthrough to the 
source and drain via the parasitic capacitances. Since there is no single time instant which can 
be said to be the sampling moment, the sampling must be modeled with the sampling function. 

A thorough analysis of the nMOS transistor sampling function based both on analytic 
formulation and simulations is presented in [144]. The authors define the aperture time as the time 
interval which covers 80% of the sampling function area. Their analysis shows that for very 
small sampling capacitor values the aperture time has an almost linear dependence on the gate 
voltage fall time. On the other hand, when the sampling capacitor is large, the sampling function 
is dominated by the tracking mode time constant, which is typically the case in S/H circuits. Sub- 
samplers and digital high-speed line receivers are applications where the bandwidth reduction 
resulting from the finite turn-off time is remarkable. 




Chapter 7 

OPERATIONAL AMPLIFIERS 



The opamp is a widely used building block in many types of analog circuits. 
Often, it is just the spot where the limits of the technology are first met when 
trying to enhance the speed or reduce the power consumption of the circuits. In 
the SC technique and in the pipelined ADC based on it the opamp is a central 
component. 

Opamp design methodology and various circuit topologies have been thoroughly 
covered in many textbooks. Thus, presenting a comprehensive study 
in this context does not serve the present purpose; indeed it may not even be 
possible. Instead, this chapter tries to concentrate on issues related to 
low-voltage and high-speed design with modern IC technologies. The requirements 
of opamps in SC circuits are reviewed and the most important and suitable 
circuit topologies, with their pros and cons, are compared. 



1. Requirements for SC Applications 

The maximum speed and, to a large extent, the power consumption of SC 
circuits are determined by the opamp. The opamps in SC circuits have some 
unique requirements, the most important of which is the input impedance, which 
must be purely capacitive so as to guarantee the conservation of charge. Con- 
sequently, the opamp input has to be a MOSFET, either in the common source 
or the source follower configuration. Since it is not possible to employ a BJT 
as an input transistor, the speed and power savings offered by the BiCMOS 



1.1 Output Impedance 

Another characteristic feature of SC circuits is the load at the opamp output, 
which is typically purely capacitive. As a result, since there is no need to drive 
resistive loads, the opamp output impedance can be high, making it possible 
to use operational transconductance amplifiers (OTAs). The output stage of an 
OTA provides significant voltage gain, so the gain target can be reached 
with a smaller number of stages. 

1.2 Output Voltage Range 

The opamp output voltage range, as already discussed in Chapter 2, has a ma- 
jor impact on the signal-to-noise ratio. Consequently, maximizing the voltage 
swing is especially important in low-voltage and high resolution applications. 
Unfortunately, an output stage with high signal swing cannot usually provide 
very high output impedance, thus increasing the number of opamp stages. High 
output swing may also increase noise, since the output stage current sources 
cannot be optimally sized for low noise. 

In fully differential opamps the common mode voltage level at the output 
is not automatically determined. To set it to the wanted level (typically in the 
middle of the supply voltages) a common mode feedback (CMFB) circuit has 
to be used. 

1.3 Input Common Mode Range 

SC circuits typically employ opamps in the inverting feedback configuration 
(and are fully differential, making signal inversion possible simply by crossing 
the wires), which does not require a large common mode voltage range in the 
opamp input. Thus, low-voltage circuits can be constructed with opamps that 
do not have a rail-to-rail input stage. If present, the single-ended to differential 
conversion in the circuit front-end, however, may need an opamp in the non- 
inverting configuration. In addition, a fully differential input signal without an 
exactly-known common mode level requires some common mode input range 
from the opamp of the front-end stage to accommodate signal CM voltage 
uncertainty or changes in it. 

In SC circuits the opamp input common mode level need not be equal to 
the output CM level, which is usually set to Vdd/2 so as to maximize signal 
swing. This freedom can be utilized in low voltage circuits, for example when 
an nMOS input pair is employed, by setting the CM level close to Vdd, 
which leaves more voltage headroom for the input pair and the tail current 
source. In principle, the CM level can be raised all the way to Vdd, but then 
care must be taken that the junction diodes of the pMOS switches attached to 
the opamp input do not become forward biased during the settling. 

1.4 DC Gain 

The ultimate settling accuracy is limited by the finite opamp DC gain. The 
exact settling error depends not only on the gain but also on the feedback 
factor in the circuit utilizing the opamp. Typically, the DC gain requirement 
is from 60 dB up to 100 dB. In some circuits, such as a front-end S/H circuit, 
insufficient opamp DC gain results only in a gain error, which is usually tolerable. 
The DC gain, however, has to be constant over the opamp output voltage range 
in order to avoid harmonic distortion. 
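
The dependence of the static error on the gain and the feedback factor can be sketched with the standard feedback relation, in which the relative error of the closed loop is 1/(1 + A0·β). The following Python example is a rough sizing aid under stated assumptions: the half-LSB error budget and the feedback factor of 0.5 are choices made for the example, not requirements from the text, and the function names are hypothetical.

```python
import math


def static_error(a0_db, beta):
    """Relative static settling error 1/(1 + A0*beta) of a feedback amplifier."""
    a0 = 10 ** (a0_db / 20)
    return 1.0 / (1.0 + a0 * beta)


def required_gain_db(n_bits, beta, lsb_fraction=0.5):
    """DC gain needed to keep the static error below lsb_fraction of an LSB."""
    eps = lsb_fraction / 2 ** n_bits
    return 20 * math.log10((1.0 / eps - 1.0) / beta)


print(f"error with 60 dB gain, beta 0.5: {static_error(60, 0.5):.2e}")
print(f"gain for 10 bits, beta 0.5: {required_gain_db(10, 0.5):.1f} dB")
```

With these assumptions a 10-bit stage needs a DC gain of slightly over 70 dB, which falls inside the 60 dB to 100 dB range mentioned above.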

1.5 Bandwidth and Phase Margin 

When using a single pole model for the opamp, the settling time is determined 
by the gain-bandwidth product (GBW) of the opamp and the feedback factor of 
the circuit. In practical circuits, there is almost always more than just one pole 
and often zeros as well. However, in order to use the opamp in a closed loop 
configuration, it has to be designed in such a way that its frequency response 
is close to the single pole response. Consequently, there is one dominant low- 
frequency pole, while the other poles and zeros lie at much higher frequencies. 
In the frequency response their presence is seen as a phase roll-off in the high 
frequencies. Thus, the phase margin at the unity gain frequency has an effect 
on the settling time as well. 
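
For the single pole model the settling behaviour follows from the closed-loop time constant 1/(2π·β·GBW). The Python sketch below ignores slewing and all non-dominant poles; the function names and the numerical values are assumptions chosen for illustration.

```python
import math


def settling_time(gbw_hz, beta, n_bits):
    """Linear settling time to within 1/2**n_bits for a single-pole opamp."""
    tau = 1.0 / (2 * math.pi * beta * gbw_hz)    # closed-loop time constant
    return tau * n_bits * math.log(2)


def required_gbw(t_settle, beta, n_bits):
    """GBW needed to settle to within 1/2**n_bits in the time t_settle."""
    return n_bits * math.log(2) / (2 * math.pi * beta * t_settle)


# Hypothetical example: 500 MHz GBW, feedback factor 0.5, 10-bit accuracy.
t_set = settling_time(500e6, 0.5, 10)
print(f"settling time: {t_set * 1e9:.2f} ns")
print(f"GBW for 5 ns settling: {required_gbw(5e-9, 0.5, 10) / 1e6:.0f} MHz")
```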

If the opamp is not utilized in unity gain feedback (e.g. auto-zeroing) the 
required phase margin is not defined at the unity gain frequency but at the 
frequency of the closed loop gain, and so it is easier to achieve. 

The fastest settling is obtained when the first overshoot of the step response 
just touches the upper settling bound [145]. The higher the accuracy requirement, 
the closer the optimum becomes to the critically damped settling, which 
corresponds to a 76° phase margin in a two-pole system. This is significantly 
larger than the 60° rule of thumb used for continuous time circuits and 
mentioned in some textbooks in conjunction with SC circuits. 

Sometimes, when there is more than one non-dominant pole, zeros, pole-zero 
doublets, or complex pole pairs in the circuit, the phase margin does not give 
a good indication of the settling time; it can be significantly longer or shorter 
than in a two-pole system with the same phase margin. 

1.6 Slew Rate 

Besides the opamp bandwidth, the settling time is limited by the fact that the 
opamp can supply only a finite current to the load capacitor (or the compensation 
capacitor). Consequently, the output cannot change faster than the slew rate, 
which is given by 

\begin{equation}
SR = \frac{I_{SR}}{C_L}, \tag{7.1}
\end{equation}

where CL is the load capacitance and ISR the available slewing current. When 
designing an opamp, the load capacitor is known and the required slew rate 
(SR = k · Vmax/Ts) can be calculated from the largest voltage step (Vmax) 
and the clock period (Ts), half of which is typically available for settling. A 
commonly used rule of thumb suggests that one third of the settling time should 
be reserved for slewing, resulting in k of six. 

The required slewing current is 

\begin{equation}
I_{SR} = \frac{k \, V_{max} C_L}{T_s}. \tag{7.2}
\end{equation}

It is linearly dependent on the clock frequency, while the current needed to 
obtain the opamp bandwidth has a quadratic dependence, which means that in 
high speed circuits the opamp current often needs to be higher than required 
by the slew rate [5]. On the other hand, in slow- and medium-speed circuits 
the slew rate keeps the current unnecessarily high. Since the slewing current 
is needed during only a fraction of the clock period, remarkable power savings 
can be achieved if the current can be adjusted according to the need. This can 
be accomplished either by using a class AB output stage or by dynamic biasing. 
The latter, however, has not gained popularity, while the class AB output stages 
are widely employed. 
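
As a quick numerical illustration of the slewing current requirement, the following Python sketch evaluates ISR = k·Vmax·CL/Ts with k = 6 from the rule of thumb above. The step size, load capacitance, and clock frequencies are assumed values, and the function name is hypothetical.

```python
def slewing_current(v_max, c_load, f_clk, k=6.0):
    """Required slewing current I_SR = k * Vmax * C_L / Ts."""
    t_s = 1.0 / f_clk                  # clock period
    return k * v_max * c_load / t_s


# Hypothetical example: 1 V step into 2 pF at 100 MHz and 200 MHz clock rates.
i_sr = slewing_current(v_max=1.0, c_load=2e-12, f_clk=100e6)
i_sr_fast = slewing_current(v_max=1.0, c_load=2e-12, f_clk=200e6)
print(f"{i_sr * 1e3:.2f} mA at 100 MHz, {i_sr_fast * 1e3:.2f} mA at 200 MHz")
```

Doubling the clock frequency doubles the slewing current, which is the linear dependence noted in the text.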

1.7 Noise 

In high speed Nyquist-rate ADCs opamp noise is dominated by thermal 
noise, while 1/f noise is less important. Consequently, for noise reasons, there 
is no point in using a pMOS input pair (which has inherently lower 1/f noise) 
in the opamp. 

The total noise contribution of all the devices in the opamp is usually com- 
bined as a single voltage source at the amplifier input. Assuming the noise 
sources to be uncorrelated, the total noise is obtained as a root of the sum of the 
squares of the individual input-referred noise sources. The noise contribution 
of the devices in the opamp’s first stage is the most significant, and usually the 
noise of the other stages can be neglected, since it is attenuated by the preceding 
voltage gain. 

For MOSFETs the gate-referred thermal noise is given by 

\begin{equation}
\overline{v_n^2} = 4 \gamma \frac{kT}{g_m} \Delta f, \tag{7.3}
\end{equation}

where T is the absolute temperature, k Boltzmann's constant, Δf the differential 
frequency, gm the transistor small signal transconductance, and γ the 
noise excess factor, which is 2/3 for long channel devices (L > 1.7 µm). In short 
channel devices the hot carrier effects increase the noise, leading to a larger 
value of γ. It has been experimentally shown [146] that for 0.7-µm devices the 
value of γ ranges from 2.5 to 9, depending on the bias conditions. Analytical 
models for γ have been proposed in [147] and [148]; unfortunately, they are 
too complicated for hand calculations. In general, however, γ increases as the 
gate-source voltage decreases and/or as the drain-source voltage increases. 

The input-referred noise of the opamp input pair is directly given by equation 
(7.3). Thus, it is reduced by increasing the transconductance, which can be done 
by utilizing nMOS devices, increasing the current, or increasing the aspect ratio 
of the devices. The effect of the last method, however, is partially canceled by 
the increase in γ. 

When referred to the opamp input, the noise voltages of the transistors used 
as current sources (or mirrors) in the first stages are multiplied by the 
transconductance of the device itself and divided by the transconductance of the input 
transistor. As a result, the total input-referred noise is 

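\begin{equation}
\overline{v_{n,opamp}^2} = 4 \frac{\gamma_{in} k T}{g_{m,in}} \Delta f \left( 1 + \frac{\gamma_1 g_{m,1}}{\gamma_{in} g_{m,in}} + \frac{\gamma_2 g_{m,2}}{\gamma_{in} g_{m,in}} + \cdots \right), \tag{7.4}
\end{equation}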

which again suggests that maximizing input pair transconductance minimizes 
noise. It can be further reduced by decreasing the transconductances of the 
current sources. Since the current is usually set by other requirements, the 
only possibility is to decrease the aspect ratio of the device. This leads to 
an increase in the gate overdrive voltage VGS - Vt, which, as a positive side 
effect, also decreases γ. It should be noticed that the overdrive voltage is equal 
to Vdsat. Consequently, obtaining low noise with low supply voltage is difficult, 
especially with single stage opamps, where the output signal swing does not 
permit large Vdsat. Increasing L to avoid short channel effects is also possible, 
but with a constant aspect ratio it increases the parasitic capacitances, reducing 
the opamp bandwidth. 

Cascode transistors do not make a significant contribution to noise, be- 
cause their noise voltage is transformed into current through the high output 
impedance of the underlying current source. 

When looking at equation (7.4), it is clear that opamp topology does not affect 
the noise contribution of the input pair except through the device type (nMOS 
or pMOS). On the other hand, the number, and to some extent, the magnitude of 
the terms in the parentheses are dependent on the topology. When comparing 
opamp topologies it is convenient to write the term in parentheses as 1 + γOA, 
where γOA is referred to as the opamp noise excess factor. 
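
The role of the opamp noise excess factor can be illustrated numerically. The Python sketch below follows the rule stated above, in which each current-source device contributes its own noise scaled by the ratio of its transconductance to that of the input transistor; it treats the input pair as a single device and ignores device counts, which is a simplification. The function name and all device values are assumptions.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def opamp_noise_density(gm_in, gamma_in, sources, temp=300.0):
    """Input-referred thermal noise density in V^2/Hz.

    sources is a list of (gamma, gm) pairs for the current-source devices.
    Returns the density and the opamp noise excess factor gamma_OA.
    """
    gamma_oa = sum(g * gm for g, gm in sources) / (gamma_in * gm_in)
    density = 4 * gamma_in * K_BOLTZMANN * temp / gm_in * (1 + gamma_oa)
    return density, gamma_oa


# Hypothetical devices: 5 mS input transistor, one 2 mS current source.
density, gamma_oa = opamp_noise_density(5e-3, 1.0, [(1.0, 2e-3)])
print(f"gamma_OA = {gamma_oa:.2f}, noise = {math.sqrt(density) * 1e9:.2f} nV/sqrt(Hz)")
```

Halving the transconductance of the current source in this example would lower the excess factor from 0.4 to 0.2, in line with the observation that the current-source transconductances should be kept small.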

2. OTAs with Single High-Gain Stage 

OTAs with a single gain stage have been widely employed in SC circuits. 
A high output impedance provides an adequate DC gain, which can be further 
increased with gain boosting techniques. Single-stage architecture offers 
large bandwidth and a good phase margin with small power consumption. 
Furthermore, no frequency compensation is needed, since the architecture is 
self-compensated (the dominant pole is determined by the load capacitance), which 
makes the footprint on the silicon small. On the other hand, the high output 
impedance is obtained by sacrificing the output voltage swing, and the noise is 
rather high as a result of the number of noise-contributing devices and limited 
voltage headroom for current source biasing. 

Figure 7.1. Telescopic OTA [118]. 

Figure 7.2. Folded cascode OTA [149]. 

2.1 Telescopic OTA 

The telescopic OTA [118], shown in Figure 7.1, is probably the fastest pos- 
sible architecture. Both the GBW and the lowest non-dominant pole are de- 
termined by nMOS devices, resulting in both large bandwidth and good phase 
margin. The number of current legs being only two, the power consumption is 
small. 

The most prominent drawback of this architecture is the limited voltage 
swing both at the output and the input of the opamp. From the high side the 
output swing is limited to 2Vdsat below Vdd and from the low side to a minimum 
of 3Vdsat above Vss. With this maximum possible output swing the input 
common mode range is zero. In practice, some input CM range, which reduces 
the output swing, always has to be reserved so as to permit inaccuracy and 
settling transients in the signal common mode levels. With supply voltages of 
5 V or larger the voltage swing, however, is often more than sufficient, so that 
even an extra set of cascodes can be inserted in both the nMOS and pMOS sides 
to enhance the DC gain. But, when the supply is 3 volts or less, the swing is 
too small for most SC applications. 



2.2 Folded Cascode OTA 

The folded cascode OTA [149], shown in Figure 7.2, is probably the most 
commonly used opamp architecture in SC circuits. It provides a larger output 
swing and input CM range than the telescopic OTA with the same DC gain 
and without major loss of speed. The output swing, $V_{DD} - 4V_{dsat}$, is not
linked to the input CM range, which is $V_{DD} - V_T - 2V_{dsat}$ (obtained using
$V_{GS} = V_T + V_{dsat}$).
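
The swing expressions quoted for the telescopic and folded cascode OTAs can be compared numerically. The following Python sketch is only illustrative: the function names, the saturation voltage of 0.3 V, and the threshold voltage of 0.7 V are assumptions made for the example, not values from the text.

```python
def telescopic_output_swing(vdd, vdsat, vss=0.0):
    """Telescopic OTA: 2*Vdsat is lost below VDD and 3*Vdsat above VSS."""
    return (vdd - 2 * vdsat) - (vss + 3 * vdsat)


def folded_cascode_output_swing(vdd, vdsat):
    """Folded cascode OTA: output swing VDD - 4*Vdsat."""
    return vdd - 4 * vdsat


def folded_cascode_input_cm_range(vdd, vt, vdsat):
    """Folded cascode OTA: input CM range VDD - VT - 2*Vdsat."""
    return vdd - vt - 2 * vdsat


if __name__ == "__main__":
    VDSAT, VT = 0.3, 0.7  # assumed, illustrative device values in volts
    for vdd in (5.0, 3.0):
        print(f"VDD = {vdd:.1f} V:",
              f"telescopic swing {telescopic_output_swing(vdd, VDSAT):.2f} V,",
              f"folded swing {folded_cascode_output_swing(vdd, VDSAT):.2f} V,",
              f"folded input CM range "
              f"{folded_cascode_input_cm_range(vdd, VT, VDSAT):.2f} V")
```

Note that the telescopic figure is the maximum swing, obtained with zero input CM range; any reserved input CM range reduces it further.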

The choice between an nMOS and pMOS input pair has to be made on the 
basis of the required phase margin. The nMOS input architecture, shown in 
Figure 7.2, offers large GBW ($g_{m1}/C_L$) thanks to the nMOS input transistors,
but the lowest non-dominant pole ($g_{m6}/C_{n1}$) associated with the node n1 is
determined by the low pMOS transconductance and the large stray capacitances 
of the pMOS current sources and the cascode devices. On the other hand, 
utilizing a pMOS input pair gives lower GBW, but the non-dominant pole is 
higher, thanks to the nMOS cascode devices. 

Feedforward capacitors can be used to bypass the cascode transistors at high 
frequencies to improve the phase margin [150, 151, 152, 153]. In principle, 
the technique produces a zero, which is used to cancel the pole associated with 
the cascode node. It is, however, not possible to place this zero exactly on 
top of the pole. Thus, there is a sufficiently closely spaced pole-zero pair, a 
doublet, which is known to introduce a slowly settling component in the step 
response [154]. Consequently, feedforward techniques should be used with 
care in opamps employed in SC circuits. 
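
The effect of a doublet on settling can be illustrated with a simple model. The sketch below assumes a unity-feedback loop around an idealized opamp with the open-loop response $A(s) = (\omega_u/s)(1 + s/\omega_z)/(1 + s/\omega_p)$; this model, the function name, and the numerical values are assumptions chosen for illustration and are not taken from [154]. The code computes the two exponential terms of the closed-loop step response in closed form.

```python
import math


def doublet_step_terms(w_u, w_p, w_z):
    """Step response terms of a unity-feedback loop with
    A(s) = (w_u / s) * (1 + s / w_z) / (1 + s / w_p).

    Returns [(amplitude, rate), ...] so that
    y(t) = 1 + sum(amplitude * exp(-rate * t)).
    """
    # Closed-loop denominator: s^2 + b*s + c
    b = w_p * (1.0 + w_u / w_z)
    c = w_u * w_p
    disc = b * b - 4.0 * c
    if disc <= 0.0:
        raise ValueError("closed-loop poles are not real and distinct")
    root = math.sqrt(disc)
    poles = ((-b - root) / 2.0, (-b + root) / 2.0)
    terms = []
    for p, q in (poles, poles[::-1]):
        # Residue of T(s)/s at s = p, T(s) = c*(1 + s/w_z)/((s - p)*(s - q))
        amplitude = c * (1.0 + p / w_z) / (p * (p - q))
        terms.append((amplitude, -p))
    return terms


if __name__ == "__main__":
    w = 2.0 * math.pi * 1e6  # 1 MHz in rad/s
    # Assumed example: 100 MHz unity-gain frequency, doublet near 10 MHz
    for amplitude, rate in doublet_step_terms(100 * w, 9 * w, 10 * w):
        print(f"amplitude {amplitude:+.4f}, time constant {1e9 / rate:.2f} ns")
```

With these values the response contains a small term that decays with a time constant close to $1/\omega_z$, roughly ten times slower than the main settling term, which is the slowly settling component referred to above.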

It is possible to employ an nMOS and a pMOS input pair in parallel [155],
which increases the slew rate by 1/3 (with the same total current consumption),
but at the same time increases the input capacitance and thermal noise and
lowers the non-dominant pole. Another possible way to increase the slew rate
and ensure that all transistors remain in saturation during slewing is to clamp
the cascode nodes with diode-connected devices [156].

An important fact, which is not mentioned in many textbooks, is that the impedance
seen at the cascode node is actually high at DC [157]. The impedance is
equal to the output impedance of the opamp, attenuated by the gain of the cascode
device ($g_m/g_{ds}$). Thus, in a folded cascode structure the impedance of the
cascode node is on the order of $2/g_{ds}$. When going to higher frequencies the
load capacitance starts to dominate the opamp output impedance and, as a result, 
the impedance at the cascode node decreases without forming a low-frequency 
pole in the frequency response. 

In SC circuits high cascode impedance can be harmful, since it results in a 
Miller multiplication of the gate-drain capacitance of the opamp input device. 
In an SC amplifier a significant amount of charge intended for the feedback 
capacitor ends up in the opamp input capacitances, resulting in a gain error. 
The Miller effect can be avoided by inserting extra cascode transistors on top 
of the input pair [158, 159]. In the case of an nMOS input pair, the added 
non-dominant pole is much higher than the one already present in the transfer 
function and thus the phase margin is not reduced significantly. An alternative 
method to get rid of the Miller effect is to put capacitors, matched to the gate- 
drain capacitance of the input device, between the gate of the input transistors 
and the drain of the complementary input transistor [68]. 

In BiCMOS technology the cascode transistors in a folded cascode opamp 
can be implemented with bipolar devices, resulting in a considerably high non- 
dominant pole. 

2.3 Cascode Stage with Low-Gain Preamplifier 

In addition to folding, another way to get the input pair current to the cascode 
output stage is current mirroring. The resulting circuit provides a better slew
rate than the folded cascode, but introduces another non-dominant pole, which 
becomes lower as the current mirroring ratio is increased. 

A closely related architecture is shown in Figure 7.3 [71]. Instead of a current
mirror, the first stage load is a pair of common gate-connected nMOS devices 
(M3 and M4) and the signal is taken into the output stage from the nMOS side. 
As a result, the use of pMOS devices in the signal path is avoided, giving a large 
GBW and pushing the non-dominant poles into high frequencies. There are, 
however, two non-dominant poles, one associated with the first stage output 
and the other with the cascode node, which makes the phase roll-off steep once 
it begins. To make sure that the first stage output pole is high enough, the $g_m$
of M3 and M4 has to be large, limiting the first stage gain (or current mirroring
ratio) to sufficiently small values, typically smaller than two. Unfortunately,
the thermal noise increases as the first stage gain is decreased and thus this
topology is unsuitable when low noise is required. Also, the input CM range is
fairly limited.

Figure 7.3. OTA with low-gain preamplifier [71].

In BiCMOS technology the output stage nMOS devices can be replaced with 
npn transistors, resulting in a very high GBW. 

2.4 Comparison of Single-Stage OTAs 

The main characteristics of the OTAs presented above are collected in Table 7.1
using the following notations: $g_{mx}$ is the transconductance of transistor
Mx, $C_L$ is the load capacitance, $C_{nx}$ is the parasitic capacitance associated with
node nx, and $I_S$ is the input pair tail current. In the third OTA m is the ratio
of the aspect ratios of M6 and M3 and n is the ratio of the output stage current
to the input stage current. The noise is given with the noise excess factor $\gamma_{oa}$.
It can be seen that each architecture has its pros and cons, suggesting that the 
choice has to be made on the basis of which specifications are important in the 
target application. 

2.5 Gain Enhancement Techniques

In many applications the opamp DC gain requirement is higher than what
is achievable with simple single-stage topologies. Techniques to enhance the
opamp DC gain without going into multi-stage architectures are especially
welcome in high-speed circuits, where the high current levels make the transistor
$g_{ds}$ large.

A very widely used method is based on improving the cascoding effect of
a single MOS transistor by using local negative feedback [160]. The resultant
circuit, often referred to as a regulated cascode, is utilized in a current source
that is shown in Figure 7.4. There, the auxiliary amplifier encloses the cascode
transistor M2 in a feedback loop, making the voltage on its source node almost
constant. As a result, the output impedance of the current source is given by

$$r_{out} \approx \frac{A_1 g_{m2}}{g_{ds2}\, g_{ds1}}. \qquad (7.5)$$

Thus the regulation improves the impedance by the gain of the regulation
amplifier $A_1$ and, when the current source is utilized in an OTA, the DC gain is
increased by the same amount.
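
The size of the improvement can be illustrated with a few lines of Python. The sketch assumes the approximation $r_{out} \approx A_1 g_{m2}/(g_{ds1} g_{ds2})$ for the regulated cascode and treats the plain cascode as the case $A_1 = 1$; the function names and device values are hypothetical.

```python
import math


def cascode_rout(gm2, gds1, gds2, a1=1.0):
    """Approximate output resistance a1 * gm2 / (gds1 * gds2).

    a1 = 1 corresponds to a plain cascode without a regulation amplifier.
    """
    return a1 * gm2 / (gds1 * gds2)


def gain_boost_db(a1):
    """Increase of the OTA DC gain, in dB, for a regulation gain a1."""
    return 20.0 * math.log10(a1)


if __name__ == "__main__":
    GM2, GDS1, GDS2 = 1e-3, 50e-6, 50e-6  # assumed values in siemens
    A1 = 100.0                            # assumed regulation amplifier gain
    print(f"plain cascode:     {cascode_rout(GM2, GDS1, GDS2):.3g} ohm")
    print(f"regulated cascode: {cascode_rout(GM2, GDS1, GDS2, A1):.3g} ohm")
    print(f"DC gain increase:  {gain_boost_db(A1):.1f} dB")
```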

Three different implementations of the regulation amplifier are shown in
Figure 7.5. The one in Figure 7.5 (a) [160, 161] is very simple, but sets the voltage
on the cascode node unnecessarily high. The circuit in Figure 7.5 (b) [162] utilizes
a level shifter and the one in Figure 7.5 (c) [163] a common gate amplifier,
to allow the biasing of the cascode node to a lower voltage. Using a more
complicated regulation amplifier, e.g. a folded cascode OTA, is also possible.
In fully differential circuits the regulation amplifier can also be fully differential.

The regulated cascodes were for the first time utilized in an opamp in [164],
where the DC gain of a folded cascode OTA was boosted to 90 dB. It was
shown that if the GBW of the regulation amplifier is larger than the dominant
pole of the unregulated opamp, the regulation does not have a significant effect
on the opamp bandwidth. However, it introduces a pole-zero doublet, which
may slow down the settling considerably. Thus, it was suggested that
the regulation amplifier GBW should be larger than the closed loop bandwidth
of the opamp.

Figure 7.5. Regulated cascode current source: three implementations (a), (b), and (c).

The frequency response and the settling of the regulated cascode opamp were
analyzed in more detail in [165] with the aid of a symbolic circuit simulator.
It was found that when the doublet is pushed into higher frequencies it in fact
merges with the non-dominant pole of the opamp, forming a complex pole pair
and a zero. Achieving fast settling at a reasonable cost requires optimizing
the pole pair quality factor and frequency and the zero frequency with respect
to each other. Optimal settling calls for a regulation amplifier GBW that is
substantially higher than the opamp GBW. An opamp designed according to
these principles has been utilized in [166].

When the regulation amplifier is more complex than the one in Figure 7.5 (a)
it has non-dominant poles, which make the frequency response and the settling
behavior more complicated to analyze.

In addition to the cascode regulation other techniques for increasing the DC
gain have been proposed as well. Gain boosting with positive feedback has
been investigated, e.g. in [167] and [168]. In [169], dynamic biasing, where
the opamp current is decreased toward the end of the settling phase, is used
to increase the DC gain. It exploits the fact that current reduction lowers the
transistor $g_{ds}$, which increases the DC gain.

In a feedback configuration, due to finite opamp gain, the opamp input is
not a perfect virtual ground but sits at the voltage $-V_{out}/A_0$. Reducing this
signal-dependent value by some means has the same effect as increasing the opamp
DC gain.

In [171] a replica circuit with an opamp plus feedback circuit is used to
generate the voltage difference in the input terminals. A second replica opamp,
which has its output tied to the output of the main opamp, uses the generated
voltage as its input to drive the load. The main amplifier only needs to fine-tune
the output voltage and, as a result, its effective DC gain is enhanced.
This technique is also suitable for circuits with a resistive load.

In SC circuits the same principle can be realized in the time domain by having
an extra clock phase prior to the amplification phase to sample the input voltage
onto a capacitor put in series with the opamp input [170, 92].

Figure 7.6. Miller opamp.

3. Two-Stage Opamps 

In high-resolution low-voltage applications thermal noise and the opamp
output swing are emphasized. Thus, two-stage opamps are often preferable.

3.1 Miller Opamp 

The Miller opamp is the most basic two-stage opamp. It provides rail-to-rail
output swing and low thermal noise and can be used with low supply voltages.
The GBW is $g_{m1}/C_c$ and the non-dominant pole $g_{m6}/C_L$,
which is lower than in single-stage OTAs. Consequently, when a large phase
margin is required, the Miller opamp cannot achieve as large a bandwidth as
single-stage OTAs.

Often a resistor is put in series with the compensation capacitor in order to
cancel or move the right half-plane (RHP) zero.

3.2 High-Gain First Stage and Rail-to-Rail Output Stage 

The Miller topology offers only a moderate DC gain, which is often
too small for high-resolution applications. The most straightforward way to
increase the gain is to employ a high-gain first stage. Since the first stage
does not need to have a large output voltage swing, it can be
either a telescopic or a folded cascode. The advantages of the folded cascode
structure (Figure 7.7) are a larger input CM range and the avoidance of level
shifting between the stages, while the telescopic stage (Figure 7.8) can offer
larger bandwidth and lower thermal noise. Thus, when there is no special need
for a large input CM range the latter is more attractive.

Figure 7.7. Folded cascode OTA with rail-to-rail output stage.

Level shifting between the telescopic input stage and the rail-to-rail output
stage can be avoided by taking the signal into the output stage from the pMOS
side. However, this has two disadvantages. First, the voltage across the first
stage cascode current sources would be only $V_T + V_{dsat}$, which does not
permit the optimal biasing of the current sources for low noise. Second, the low
transconductance of the pMOS device results in a low-frequency non-dominant
pole.

The level shifting can be realized with a source follower, which, however,
increases current consumption and noise. In SC circuits a capacitive level shifter,
which does not have these disadvantages, can be utilized [172]. Figure 7.9 (a)
shows the principle of a switched capacitor level shifter, which is based on a
capacitor that is periodically charged to the desired offset voltage ($V_{B1} - V_{B2}$).

The parasitic capacitances, shown in Figure 7.9 (b), form a voltage
divider with the level shift capacitor $C_1$. Since, as a result of the Miller effect,
the gate-drain capacitor $C_3$ is multiplied by the gain of the output stage, the
capacitor $C_1$ has to be large to avoid attenuation. It can easily be shown that
with capacitive level shifting the output stage DC voltage gain is

$$\frac{V_{OUT}}{V_{IN}} = \frac{g_m C_1}{g_o (C_1 + C_2 + C_3) + g_m C_3},$$

where $g_m$ and $g_o$ are the transconductance and the output conductance of the
output stage. The gain is always less than $C_1/C_3$. With a sufficiently large $C_1$
this is not a severe limitation.

Figure 7.8. Telescopic OTA with rail-to-rail output stage.

Figure 7.9. Capacitive level shifting: (a) principle and (b) with parasitic capacitances.
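
The influence of the level shift capacitor on the output stage gain can be explored numerically. The sketch below assumes the charge-balance result $V_{OUT}/V_{IN} = g_m C_1/(g_o (C_1 + C_2 + C_3) + g_m C_3)$, with $C_2$ the parasitic capacitance to ground and $C_3$ the gate-drain capacitance; the function name and all component values are hypothetical.

```python
def level_shifted_stage_gain(gm, go, c1, c2, c3):
    """Magnitude of the output stage DC gain with capacitive level shifting:
    gm*C1 / (go*(C1 + C2 + C3) + gm*C3)."""
    return gm * c1 / (go * (c1 + c2 + c3) + gm * c3)


if __name__ == "__main__":
    GM, GO = 5e-3, 50e-6      # assumed: intrinsic stage gain gm/go = 100
    C2, C3 = 100e-15, 20e-15  # assumed parasitic capacitances in farads
    for c1 in (0.5e-12, 2e-12, 10e-12):
        gain = level_shifted_stage_gain(GM, GO, c1, C2, C3)
        print(f"C1 = {c1 * 1e12:4.1f} pF: gain {gain:5.1f}",
              f"(bounds: C1/C3 = {c1 / C3:.0f}, gm/go = {GM / GO:.0f})")
```

The gain approaches the intrinsic value $g_m/g_o$ only when $C_1$ is much larger than the Miller-multiplied $C_3$.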



3.2.1 Frequency Compensation 

When there are cascode nodes in the first stage, cascode compensation, in 
which the compensation capacitor is connected to the cascode node instead of 
the first stage output, can be used instead of the standard Miller compensa- 
tion [173, 174]. It offers higher GBW and the RHP zero is at a much higher 
frequency. 
Instead of a single non-dominant pole there is a complex pole pair and a
zero [174]. It is well known that, because of the complex pole pair, cascode
compensation can result in gain peaking near the unity gain frequency, increasing
the settling time. To avoid this, the Q-value of the complex pole pair has to be
less than $1/\sqrt{2}$, which is satisfied when [175]

$$g_{mc} > \frac{2 C_c m}{C_2 (1 + m)^2} \cdot g_{m2}, \qquad (7.7)$$



where $g_{mc}$ is the transconductance of the cascode device, $g_{m2}$ the transconductance
of the output stage input device, $C_c$ the compensation capacitor, $C_2$ the
capacitance from the first stage output to the ground, and m the ratio of the load
capacitor to the compensation capacitor. When the load capacitor is equal to
the compensation capacitor the equation reduces to $g_{mc} > 0.5 \cdot g_{m2} \cdot C_c/C_2$.
Typically, when the capacitor ratio $C_c/C_2$ is of the order of ten or more, this
is not easy to achieve. The situation is somewhat easier if the cascode device 
has inherently larger g m than the output device (e.g. nMOS vs pMOS or BJT 
vs MOSFET). 
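
A quick calculation shows why the condition is demanding. The sketch below evaluates the bound $g_{mc} > 2 C_c m\, g_{m2}/(C_2 (1 + m)^2)$ with $m = C_L/C_c$; the function name and the component values are assumptions made for the example.

```python
def min_cascode_gm(gm2, cc, c2, cl):
    """Smallest cascode transconductance that keeps the Q-value of the
    complex pole pair below 1/sqrt(2), with m = CL/Cc."""
    m = cl / cc
    return gm2 * 2.0 * cc * m / (c2 * (1.0 + m) ** 2)


if __name__ == "__main__":
    GM2 = 10e-3            # assumed output device transconductance, siemens
    CC, CL = 1e-12, 1e-12  # assumed compensation and load capacitors, farads
    C2 = 0.1e-12           # assumed first stage output capacitance, farads
    gmc = min_cascode_gm(GM2, CC, C2, CL)
    print(f"required g_mc > {gmc * 1e3:.0f} mS for g_m2 = {GM2 * 1e3:.0f} mS")
```

With $C_c/C_2 = 10$ and equal load and compensation capacitors, the cascode device would need five times the transconductance of the output device.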

In [175] it is shown that a good compromise between bandwidth and gain 
peaking can be achieved by combining the standard Miller and cascode com- 
pensation. In [176] simulations show that using two compensation capacitors, 
one connected to the cascode node in the signal path and the other to the cascode 
node in the current source of a telescopic first stage, yields faster settling than 
a single capacitor connected to either node alone. 



3.2.2 Two-Stage BiCMOS Opamp 

A two-stage BiCMOS opamp designed to achieve high speed, high DC gain, 
and high SNR is shown in Figure 7.10. This opamp is utilized in the ADC 
prototype [1] described in Chapter 12. 

The telescopic input stage provides high DC gain and minimizes the number 
of noise-contributing devices. The capacitive level shifting between the stages 
avoids the use of pMOS devices in the signal path and allows for the optimal 
biasing of the first stage common mode level for high DC gain and low noise. 
The level shift capacitors are charged during one half of the clock period by 
switching one capacitor terminal to a bias voltage, generated with a dummy 
structure, and letting the first stage common-mode feedback set the voltage at the 
other terminal. The cascode devices are implemented with bipolar transistors 
with an inherently much larger g m than the MOS transistors. Consequently, the 
cascode compensation can be used without the need to be concerned about gain 
peaking. The input stage and the output stage have separate common mode
feedback loops for two reasons: first, it is easier to achieve fast and accurate
CMFB without a fear of instability with separate loops and, second, the first-stage
CMFB is used to charge the level shift capacitor, which is not possible
with a single loop since the signal path between the stages is broken during
charging. Both the CMFB circuits are realized with standard SC circuits [177].

Figure 7.10. High-performance BiCMOS opamp.


Chapter 8

CLOCK GENERATION



The operation of different circuit blocks in SC circuits is synchronized with 
a clock signal. To guarantee that the signal at the output of an opamp is sampled 
in a steady state, non-overlapping clock signals are used in adjacent stages. As 
long as the sampling clock edge arrives before the one that ends the hold-phase 
in the driving stage, the circuit is very robust against small timing uncertainty at 
the clock edges. In the front-end, where the clock is used to sample a continuous 
time signal, the situation is completely different. 

Any deviation of the sampling moment from its ideal value results in an error 
voltage in the sampled signal. The error is equal to the signal change between 
these two moments. Thus, the sampling clock has to be regarded as a sensitive 
analog signal and treated accordingly. 

The sampling moment is determined by the clock signal zero crossing, which
corresponds to the moments when the clock phase is an integer multiple of $2\pi$.
Random variations in the phase, also known as phase noise, are a source of
timing errors. The phase noise is usually specified with the single-sideband
noise spectral density $\mathcal{L}(f)$, which, being a frequency domain parameter, does
not always give a good insight into the timing error. A related time domain
parameter, jitter, is defined as the rms error between a reference time point and
the clock signal zero crossing. Cycle-to-cycle jitter is often of interest, in which
case the previous zero crossing is taken as the reference point.

Besides random errors, the signals present on the chip or the circuit board 
can couple to the clock. 

1. Jitter 

The instantaneous voltage error caused by jitter $\Delta t$ can be approximated as

$$V_n \approx \Delta t \cdot \frac{dV_s(t)}{dt} = \Delta t \cdot A_s 2\pi f \cos(2\pi f t), \qquad (8.1)$$

where $V_s(t)$ is the signal waveform, which in the second form is assumed to
be a sinusoid with amplitude $A_s$ and frequency $f$. The rms voltage error can
be obtained by squaring (8.1) and integrating it over the signal period, yielding

$$V_{n,rms} = \Delta t \cdot A_s \sqrt{2}\, \pi f. \qquad (8.2)$$

Using this, the signal-to-noise ratio can be calculated to be

$$SNR = -20 \cdot \log\left(2\pi f \Delta t\right). \qquad (8.3)$$

Figure 8.1. Typical oscillator phase noise spectrum.

It can be seen that the SNR is independent of signal amplitude and that it 
decreases as the signal frequency increases. The maximum allowed jitter for a 
certain target SNR is given by 



$$\Delta t \approx \frac{1}{f \cdot 10^{\frac{SNR + 16}{20}}}. \qquad (8.4)$$

Knowing that the quantization noise power in an ADC is $LSB^2/12$ and allowing
the error power generated by jitter to be equally large (in the case of a full-scale sine
wave input), the jitter requirement for an $N$-bit converter can be written as

$$\Delta t = \frac{1}{f \cdot 7.7 \cdot 2^N}. \qquad (8.5)$$

If only 1 dB SNR degradation is permitted, the factor 7.7 increases to 15.1.
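
The jitter relations of this section are easy to evaluate numerically. The following Python sketch implements the SNR limit for a sinusoid, its inverse, and the ADC requirement obtained by comparing the jitter noise power with the quantization noise power $LSB^2/12$ of an $N$-bit converter driven by a full-scale sine wave. The function names and the example numbers are assumptions made for illustration.

```python
import math


def jitter_snr_db(f_sig, jitter_rms):
    """SNR limit set by rms jitter for a sinusoid of frequency f_sig."""
    return -20.0 * math.log10(2.0 * math.pi * f_sig * jitter_rms)


def max_jitter_for_snr(f_sig, snr_db):
    """Largest rms jitter that still allows the target SNR."""
    return 1.0 / (2.0 * math.pi * f_sig * 10.0 ** (snr_db / 20.0))


def adc_jitter_factor(degradation_db=None):
    """Factor k in dt = 1 / (k * f * 2**N).

    None: jitter noise power equals the quantization noise power.
    Otherwise: the total SNR may fall degradation_db below the
    quantization-limited value.
    """
    if degradation_db is None:
        power_ratio = 1.0
    else:
        power_ratio = 10.0 ** (degradation_db / 10.0) - 1.0
    return 2.0 * math.pi * math.sqrt(1.5 / power_ratio)


def max_jitter_for_adc(f_sig, n_bits, degradation_db=None):
    """Allowed rms jitter for an N-bit ADC with a full-scale sine input."""
    return 1.0 / (adc_jitter_factor(degradation_db) * f_sig * 2.0 ** n_bits)


if __name__ == "__main__":
    F_SIG, N_BITS = 100e6, 14  # assumed example: 100 MHz input, 14 bits
    print(f"factor, equal noise powers: {adc_jitter_factor():.2f}")
    print(f"factor, 1 dB degradation:   {adc_jitter_factor(1.0):.2f}")
    print(f"allowed jitter: {max_jitter_for_adc(F_SIG, N_BITS) * 1e15:.0f} fs")
```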



1.1 Jitter Sources 

The clock signal is usually taken from a crystal oscillator or from the output 
of a phase locked loop (PLL), locked to a crystal reference. A typical oscillator 
phase noise spectrum is shown in Figure 8.1 [178]. From it, three different 
regions can be identified; in the vicinity of the carrier (small $\Delta f$) the noise
power is proportional to $1/\Delta f^3$. When going further in frequency the slope
changes to $1/\Delta f^2$ and eventually the noise spectrum becomes flat. The noise
power in the non-flat region is inversely proportional to the oscillator Q-value 
and power dissipation. The crystals have very large Q’s and thus small phase 
noise and jitter. 

In a PLL the phase noise of the voltage-controlled oscillator (VCO) is attenuated
by the loop filter; the wider the filter bandwidth, the faster the loop
correcting action and the larger the resultant attenuation. On the other hand, the effect
on the reference phase noise is just the opposite; a smaller bandwidth gives more
attenuation. Typically, however, the VCO phase noise dominates.

When sampling a sinusoidal input signal with a clock signal whose noise 
spectrum has the shape shown in Figure 8.1, the resultant signal spectrum will 
have a similar shape. Thus, in general, close-in-carrier phase noise spreads the 
signal spectrum, while white wide-band phase noise raises the noise floor of 
the sampled signal. Many applications set distinct requirements for them. For 
example, in a narrow channel communication system the spreading of a strong 
interferer because of phase noise may mask a nearby weak channel, leading to 
strict specifications for close-in phase noise. 

The jitter, being a single number, does not contain as much information as 
the phase noise and thus is not always a sufficient figure of merit on its own. 
The thermal noise originating after the oscillator, e.g. in the buffering, does 
not get accumulated in the phase and thus the resulting phase noise has a white 
spectrum. Consequently, it can be fully described by the jitter. Similarly, the 
jitter of a crystal oscillator is also typically dominated by white phase noise. 

There are not many ways to reduce the effects of jitter. Oversampling can 
be used to spread the white noise arising from jitter over a wider frequency 
range, which allows the SNR to be improved with discrete time (analog [179] 
or digital) filtering and decimation. Increasing the oscillator frequency also 
tends to reduce the absolute jitter, since for a constant single-sideband noise,
the jitter is inversely proportional to the oscillator frequency. In addition, it is 
important to avoid unnecessary buffering of the sampling clock. 

1.2 Inverter Buffer 

On-chip buffering for the clock signal is needed primarily for two reasons. 
First, the capacitive loading in the clock lines may be high, requiring a buffer 
to keep the clock edges sharp. Second, the incoming clock cannot usually be 
directly utilized, but it is used as an input for a clock generator, which produces 
the non-overlapping clock signals. After the clock generator, the signals are 
buffered to achieve the required driving capacity. The buffering is the critical 
place where a clock signal, which originally has a low jitter can easily be 
contaminated. 








Figure 8.2. Clock waveform and corresponding impulse sensitivity function (ISF). 



The simplest buffer is the CMOS inverter. How its voltage noise turns into 
jitter is investigated next. It is clear that with a large amplitude input signal the 
inverter is a highly nonlinear circuit, and thus it has to be carefully considered, 
whether the linear noise models are applicable. In conjunction with the ring 
oscillators [180], the transformation of thermal noise into jitter is modeled with 
a linear time-varying model. It is based on the observation that the sensitivity 
of the zero crossing moment to a noise impulse varies over time (or phase), 
being largest when the inverter is changing its state and becoming practically
zero when the output voltage is saturated to the high or low logic level. The
phenomenon is modeled with a time-varying dimensionless impulse sensitivity
function (ISF) $\Gamma$, which is illustrated in Figure 8.2.

Here, the analysis is simplified by assuming that the inverter output slews 
during the whole transient and that the impulse sensitivity function is a square 
pulse with the same length as the transient. Consequently, a linear time-invariant 
model is used. 

The thermal noise is modeled with a current source connected to the output
and having the value $i_n = \sqrt{4kT\gamma g_m \Delta f}$. The load of the inverter is a parallel
combination of the load capacitance and the output resistance of the inverter:
$Z_L = (1/r_o + sC_L)^{-1}$. The noise current is transformed to voltage in the
output impedance, resulting in

$$\overline{v_n^2} = \frac{kT\gamma g_m r_o}{C_L}, \qquad (8.6)$$

which is obtained by integrating the spectral density from the zero frequency
to infinity. The amount of time which the zero crossing moves as a result of
the noise voltage is inversely proportional to the slope of the signal. Since the
inverter is slewing, the slope is equal to the slew rate, resulting in the following
equation for the jitter:

$$\Delta t = \frac{\sqrt{\overline{v_n^2}}}{SR} = \frac{\sqrt{kT\gamma g_m r_o C_L}}{I_D} = \sqrt{\frac{4kT\gamma r_o C_L}{\mu C_{ox} \frac{W}{L} (V_{DD} - V_T)^3}}, \qquad (8.7)$$

where $SR$ ($= I_D/C_L$) is the slew rate. The last form of the equation is obtained
using the square law model for the transistor current, $I_D = \frac{1}{2}\mu C_{ox} \frac{W}{L} (V_{DD} - V_T)^2$.
In [181] an analysis of a differential delay cell yielded similar results.

Figure 8.3. Input signal coupling to the clock signal and its effect on the sampling moment.

Probably the main finding in this jitter equation is its dependence on inverter 
DC gain ($g_m r_o$), which is due to the fact that at low frequencies the noise
current is transformed into voltage in the output resistance. Since the signal 
slope is determined by the slew rate, not the gain, there is no reason for having a 
large output resistance, which suggests that even lowering it with an additional 
parallel resistor might be useful. Another, fairly unsurprising finding is the fact 
that increasing the current reduces the jitter. Furthermore, the last form of the 
equation shows clearly how advantageous it is to have a large supply voltage. 
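
The trends discussed above can be checked with a short calculation. The sketch below assumes the simplified time-invariant model of this section, in which the rms noise voltage $\sqrt{kT\gamma g_m r_o/C_L}$ is divided by the slew rate $I_D/C_L$. The function name, the value $\gamma = 2/3$, and the device values are assumptions made for illustration.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def inverter_jitter(gm, ro, c_load, i_d, gamma=2.0 / 3.0, temp=300.0):
    """Rms jitter of a slewing inverter: rms noise voltage / slew rate."""
    v_noise = math.sqrt(K_BOLTZMANN * temp * gamma * gm * ro / c_load)
    slew_rate = i_d / c_load
    return v_noise / slew_rate


if __name__ == "__main__":
    GM, RO = 1e-3, 20e3          # assumed: 1 mS and 20 kohm (DC gain of 20)
    C_LOAD, I_D = 100e-15, 200e-6
    base = inverter_jitter(GM, RO, C_LOAD, I_D)
    print(f"jitter:                 {base * 1e15:.0f} fs")
    print(f"with half the r_o:      "
          f"{inverter_jitter(GM, RO / 2, C_LOAD, I_D) * 1e15:.0f} fs")
    print(f"with twice the current: "
          f"{inverter_jitter(GM, RO, C_LOAD, 2 * I_D) * 1e15:.0f} fs")
```

In this simple model halving the output resistance lowers the jitter by a factor of $\sqrt{2}$, while doubling the current at a fixed $g_m$ halves it.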

2. Signal Crosstalk 

One strong interfering signal, typically present both at the board and the 
chip level, is the analog input signal itself. Capacitive or inductive coupling 
between it and the clock can happen between the traces on the PCB, the package 
pins, or the bonding wires. On the chip, coupling through the substrate or via 
modulation of the supply voltage is possible. Fully differential circuitry reduces 
both the coupling and the sensitivity to it and thus it should be used for both 
the input signal and the clock, if possible. 

If the clock signal couples to the input signal, the sampling will alias the 
fundamental clock frequency and all its harmonics to the DC. The resulting DC 
offset is generally not harmful. 

On the other hand, if the input signal couples to the clock, the sampling 
produces a spurious signal at twice the signal frequency, which will be shown 
next. 

Let the ideal clock waveform be $V_{CLK,id}$ and the signal $V_{IN}$. When the signal
couples to the clock circuitry the resulting sampling clock is $V_{CLK,id} + B(f) \cdot V_{IN}$,
where $B(f)$ is a frequency-dependent coupling factor. The situation in
the vicinity of the clock waveform zero crossing is illustrated in Figure 8.3.
Assuming that the ideal clock signal is linear with slope $a$, the signal after the
coupling is:

$$V_{CLK} = a(t - nT) + B(f) \cdot V_{IN}(t), \qquad (8.8)$$

where $nT$ is the ideal zero crossing moment. Solving for $t$ and assuming that the
signal does not change much between the ideal and the actual zero crossings gives

$$t \approx nT - \frac{B(f) \cdot V_{IN}(nT)}{a} = nT - \Delta t. \qquad (8.9)$$

It can be seen that the situation is analogous to the one studied in Chapter 6
in conjunction with the signal-dependent switch turn-off moment. Using the result
derived there, the relative level of the spur at two times the signal frequency 
can be written as 



HD 2 = -20 ■ log 






(8.10) 



where $A_s$ is the amplitude and $f_s$ the frequency of the input signal. It can be
seen that the level of the spur rises with signal frequency and amplitude, and it can
be reduced by reducing the coupling and by making the clock waveform steeper.
It should be noted that the clock slope of interest is where the coupling, not the 
sampling, happens. In addition, since the spur is rather a result of mixing than 
of amplitude distortion, fully differential circuitry does not help once the clock 
has been contaminated. 
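
The mixing mechanism can be checked numerically. The sketch below is an illustration and not the result referred to as (8.10): it samples a sinusoid at instants shifted by $B \cdot V_{IN}(nT)/a$ and measures the second harmonic with a DFT. A first-order expansion of the sampled sinusoid suggests a spur-to-signal amplitude ratio of about $\pi f_s A_s B/a$; this estimate, the function names, and the numerical values are assumptions made for the example.

```python
import cmath
import math


def hd2_estimate(coupling, amplitude, f_sig, slope):
    """First-order spur-to-signal amplitude ratio: pi * f * A * B / a."""
    return math.pi * f_sig * amplitude * coupling / slope


def hd2_simulated(coupling, amplitude, f_sig, slope, n=64, cycles=5):
    """Sample a sine at instants shifted by B*Vin(nT)/a and return the
    ratio of the second-harmonic to the fundamental DFT amplitude."""
    period = cycles / (f_sig * n)  # coherent sampling: n samples, 'cycles' periods
    samples = []
    for k in range(n):
        t_ideal = k * period
        v_ideal = amplitude * math.sin(2.0 * math.pi * f_sig * t_ideal)
        t_actual = t_ideal - coupling * v_ideal / slope
        samples.append(amplitude * math.sin(2.0 * math.pi * f_sig * t_actual))

    def bin_amplitude(b):
        acc = sum(x * cmath.exp(-2j * math.pi * b * k / n)
                  for k, x in enumerate(samples))
        return 2.0 * abs(acc) / n

    return bin_amplitude(2 * cycles) / bin_amplitude(cycles)


if __name__ == "__main__":
    B, A_S, F_S, SLOPE = 0.01, 1.0, 10e6, 1e9  # assumed: 1 % coupling, 1 V/ns edge
    est = hd2_estimate(B, A_S, F_S, SLOPE)
    sim = hd2_simulated(B, A_S, F_S, SLOPE)
    print(f"estimate:  {20.0 * math.log10(est):.1f} dBc")
    print(f"simulated: {20.0 * math.log10(sim):.1f} dBc")
```

The estimate reproduces the stated trends: the spur grows with signal frequency, amplitude, and coupling, and shrinks as the clock edge becomes steeper.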



3. Circuits 

3.1 Standard Non-overlapping Clock Generator 

The clock generator for producing non-overlapping clock signals can be 
realized with a simple circuit constructed of logic gates. Such a circuit is shown 
in Figure 8.4. It is based on the idea that the falling edge of the input clock 
passes immediately through the NAND gate NA1, while the rising edge has first 
to propagate through the other NAND gate and the cascaded delay element. The 
resulting non-overlapping signals $clk_A$ and $clk_B$ have a non-overlapping time
equal to the sum of the delays at the NAND gate and the delay element. The
delay element is usually realized with an even-numbered chain of inverters. 

The main advantage of this circuit is its simplicity. At least a part of the 
buffering of output signals can be included in the delay elements, making the 
circuit quite robust. On the other hand, the non-overlap time often becomes 
larger than necessary because of the buffering included and the margin added 
to accommodate the process and temperature variations. The resulting speed 
penalty is emphasized in high clock rate circuits. Furthermore, the duty cycle 
of the generated clock signals is inherited from the input clock, requiring it to 
be close to 50%. 

Figure 8.4. A simple clock generator.




Figure 8.5. Delay locked loop (DLL). 



3.2 DLL-Based Clock Generator 

A process and temperature independent non-overlap time can be realized 
using a delay locked loop (DLL). Figure 8.5 shows a simplified block diagram 
of a DLL. It consists of a voltage-controlled delay line, a phase detector (PD), 
a charge pump (CP), and a loop filter. Negative feedback is utilized to adjust 
the variable delay to be equal to the clock period (or its half). 

In the feedback loop the phase detector generates up and down pulses, which 
are proportional to the phase difference between the input and the output clock. 
The charge pump performs a D/A conversion on the pulses and the low pass 
filter averages them over some time. The filter output voltage controls the delay 
line, eventually forcing the phase difference to zero. 

The delay line consists of several unit elements, which divide the total delay 
into uniformly spaced sub-intervals, the number of which is equal to the number 
of elements. The output signals of the individual elements are available outside 
the delay line. Figure 8.6 shows how these signals can be utilized to form non- 
overlapping clock signals. The rising edges of four signals are combined with 
the logic AND operation, resulting in two new signals, whose non-overlap time 
is equal to the unit delay and hence a fixed portion of the clock period. 
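
The edge-combining idea can be sketched behaviorally. The Python code below models a locked delay line whose total delay equals the clock period and forms two phases from four tap signals. The particular tap selection (taps 1 and N/2 for the first phase, taps N/2 + 1 and N for the second), the use of inverted taps in the AND operation, and all names and values are assumptions chosen to match the description; the actual logic of Figure 8.6 may differ.

```python
def tap(t, k, period, n_taps):
    """Output of delay element k at integer time t: the input clock
    (50 % duty cycle, high during the first half period) delayed by
    k unit delays."""
    unit = period // n_taps
    return (t - k * unit) % period < period // 2


def clock_phases(t, period, n_taps):
    """Two non-overlapping phases, each bounded by the rising edges of
    two tap signals."""
    half = n_taps // 2
    phi1 = tap(t, 1, period, n_taps) and not tap(t, half, period, n_taps)
    phi2 = (tap(t, half + 1, period, n_taps)
            and not tap(t, n_taps, period, n_taps))
    return phi1, phi2


if __name__ == "__main__":
    PERIOD, N_TAPS = 800, 8  # hypothetical: 800 ps period, 8 unit delays
    states = [clock_phases(t, PERIOD, N_TAPS) for t in range(PERIOD)]
    overlap = sum(1 for p1, p2 in states if p1 and p2)
    idle = sum(1 for p1, p2 in states if not p1 and not p2)
    print("overlap:", overlap, "ps")
    print("non-overlap time per edge:", idle // 2, "ps")
```

Because the unit delay is locked to a fixed fraction of the clock period, the non-overlap time in this model scales with the period rather than with process or temperature.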

The DLL-based clock generator is clearly a more complex circuit than the
circuit based on NAND gates. Consequently, it consumes more power and
area and requires greater design effort. Thus, its benefits have to be carefully
weighed against the disadvantages when the design decision is being made.

Figure 8.6. Generation of non-overlapping clock signals using edge combining.

The utilization of a DLL can be more versatile than just constructing non- 
overlapping signals. For example, in the time-interleaved pipelined ADC pre- 
sented in [91] the need for a full-speed clock signal is avoided by utilizing a 
DLL to generate all the required clock phases for the parallel channels. A DLL 
can be used to generate signals with virtually any duty cycle and even as a 
frequency multiplier. 

When very low jitter is essential, it is advantageous not to put the sampling
clock through a clock generator, but to take it as early as possible after it
arrives on the chip. A DLL can be employed to construct the required
complementary non-overlapping signal, as is done in the IF-sampling ADC
prototype described in Chapter 12.



Chapter 9 

DOUBLE-SAMPLING 



The clock rate of the switched capacitor circuits is limited by the bandwidth 
of the opamp; thus, in order to achieve a high speed it is essential to exploit the 
opamp efficiently. This chapter introduces a technique to double the sampling 
rate of the switched capacitor circuits without a need to increase the speed of 
the opamp. This technique, called double-sampling, was first introduced in 
[182]. It has been applied in various SC circuits such as filters, ΔΣ-modulators
[183, 184], pipelined ADCs [95, 5], and S/H circuits [11, 10, 9, 185].

This chapter begins with an introduction to the double-sampling technique, after
which its nonidealities are analyzed. A circuit structure for eliminating one of
those, the timing skew, is proposed in the last section.

1. Principle 

An SC circuit can be divided into blocks, each comprising an opamp and a 
set of switches and capacitors. The SC integrator shown in Figure 9.1 can be 
used as an example block. The integrator operates in two phases; in the first 




Figure 9.1. A switched capacitor integrator.









Figure 9.2. A double-sampled SC integrator. 



phase the circuit samples its input, which is usually the output of some other
block, in the capacitor C_S. The second phase can be called the integration
phase or amplification phase. During it, the circuit performs a charge transfer
from the sampling capacitor to the integration capacitor C_I with the aid of the
virtual ground provided by the opamp.

Since the output of the circuit has to be fully settled by the end of the inte- 
gration phase, it can already be sampled by the following circuit block in this 
clock phase. If this is done, the opamp is not needed in the sampling phase. 
Sometimes, the sampling phase is used to auto-zero the amplifier, i.e. to cancel 
its input offset voltage, but this is not necessary in many applications. Another 
possible way to exploit the opamp’s idle phase is to duplicate the sampling 
circuitry and operate the two sampling circuits in opposite clock phases. In 
this way the opamp and the integration capacitance are shared between the two 
sampling circuits. The sampling rate of the resulting double-sampled circuit,
shown in Figure 9.2, is twice that of the original circuit. There is, however, only
a minor increase in power consumption, since it is dominated by the opamp, 
which typically uses the class A architecture and hence consumes power also 
when idle. 

Double-sampling can also be applied to the S/H circuit shown in Figure 3.9,
resulting in the circuit shown in Figure 9.3. During the first clock phase the
input is sampled in the capacitor C_S1 and the capacitor C_S2 is connected to
the feedback loop around the amplifier. In the next clock phase the roles of
the capacitors are exchanged, C_S1 being in hold mode and C_S2 in sample mode.
The use of double-sampling in an S/H circuit has been demonstrated with the
prototype presented in Chapter 12.

2. Nonidealities 

Double-sampling introduces some nonidealities not present in conventional
SC circuits. Most of them arise from the mismatch between the two parallel
circuits and are basically similar to the nonidealities in parallel ADCs, which
have been analyzed e.g. in [119]. A general analysis concerning the timing of
parallel sampling systems is given in [186]. Nonidealities in double-sampled
SC filters have been investigated in [187]; however, not all the results are directly
applicable to double-sampled S/H circuits. Thus, in the following sections the
nonidealities of double-sampled circuits are analyzed, taking an approach better
suited to S/H circuits and ADCs.

Figure 9.3. A double-sampled S/H circuit.



2.1 Memory effect 

Due to the finite gain of the opamp a fraction of the previous sample remains 
stored in the parasitic capacitance in the input of the amplifier [94, 184]. Using 
the z-transform the voltage gain of the circuit in Figure 9.3 can be written as 



$$
\frac{V_{out}}{V_{in}} = \frac{1}{1 + \dfrac{C_S + C_{in} + C_{p1}}{A\,C_S} - \dfrac{C_{in}}{A\,C_S}\,z^{-1}}, \qquad (9.1)
$$



where A is the DC gain of the opamp, C_S the sampling capacitor, C_in the opamp
input capacitance, and C_p1 the parasitic capacitance at node n1. The equation
differs from the one for the conventional circuit (shown in Figure 3.9) in that it
has an extra term, proportional to z^-1, in the denominator. The equation shows
that double-sampling, together with the finite opamp gain and the parasitic
capacitances, adds a low-pass filtering effect. In the worst case (at the Nyquist
frequency) the additional error is equal to the error caused by the opamp input
capacitance in a conventional circuit.

Figure 9.4. Time domain representation of a double-sampled signal in the presence of channel
offset.
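
The frequency dependence of the memory effect can be checked numerically. The
sketch below assumes that Equation (9.1) has the form
V_out/V_in = 1/(1 + (C_S + C_in + C_p1)/(A C_S) - (C_in/(A C_S)) z^-1); the
function name sh_gain and the component values are illustrative assumptions,
not values from the prototype.

```python
import cmath
import math


def sh_gain(omega_t, a_dc, c_s, c_in, c_p1):
    """Gain of the double-sampled S/H at normalized frequency omega_t = 2*pi*f*T."""
    z_inv = cmath.exp(-1j * omega_t)
    return 1.0 / (1.0 + (c_s + c_in + c_p1) / (a_dc * c_s)
                  - c_in / (a_dc * c_s) * z_inv)


# Assumed example values: 60 dB opamp gain, 1 pF sampling capacitor.
A_DC, C_S, C_IN, C_P1 = 1000.0, 1e-12, 0.5e-12, 0.1e-12

# Relative gain error = |denominator| - 1.
gain_error_dc = abs(1.0 / sh_gain(0.0, A_DC, C_S, C_IN, C_P1)) - 1.0
gain_error_nyq = abs(1.0 / sh_gain(math.pi, A_DC, C_S, C_IN, C_P1)) - 1.0
# Same expression without the z^-1 term, i.e. the conventional circuit.
conventional_error = (C_S + C_IN + C_P1) / (A_DC * C_S)

print(f"error at DC      : {gain_error_dc:.2e}")
print(f"error at Nyquist : {gain_error_nyq:.2e}")
print(f"conventional     : {conventional_error:.2e}")
```

With these assumed numbers the error at the Nyquist frequency exceeds the
conventional one by C_in/(A C_S), while at DC the memory term cancels the
contribution of C_in, which is the low-pass behaviour noted above.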

2.2 Offset 

A static DC offset between the two signal paths can be considered as a 
constant value added to every other sample. In the time domain this can be 
written as 

$$
y(t) = \sum_{n=-\infty}^{\infty} x(t)\cdot\delta(t - nT) + \sum_{n=-\infty}^{\infty} \varepsilon\cdot\delta(t - n2T - T), \qquad (9.2)
$$

where ε is the magnitude of the offset and δ(t) the Dirac delta function. The
time domain signal is illustrated in Figure 9.4.

The frequency domain representation can be obtained with the Fourier
transform, which results in

$$
Y(f) = \frac{1}{T}\sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right) + \frac{\varepsilon}{2T}\sum_{n=-\infty}^{\infty} (-1)^n\,\delta\!\left(f - \frac{n}{2T}\right). \qquad (9.3)
$$

The equivalent magnitude spectrum is shown in Figure 9.5, where f_S = 1/T
is the clock frequency of the whole system, i.e. it is twice the clock frequency
of the individual sampling circuits. The result obtained indicates that the offset
between the two parallel signal paths results in tones at multiples of f_S/2.

In practice, large channel offset is not likely to occur in double-sampled 
circuits, since the opamp, which is the main source of offset, is common for 
both the signal paths. 
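
The tone caused by a channel offset can be reproduced with a short numerical
experiment. The following is a minimal sketch under assumed values (64 samples,
a unit-amplitude input on bin 5, an offset of 0.01); the helper bin_magnitude
is a plain DFT written out for self-containment and is not part of the
original text.

```python
import cmath
import math


def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
              for i, s in enumerate(samples))
    return abs(acc) / n


N = 64          # number of samples taken at the full rate f_S
K_SIGNAL = 5    # input tone placed exactly on bin 5 (coherent sampling)
OFFSET = 0.01   # assumed channel offset relative to a unit-amplitude input

ideal = [math.sin(2 * math.pi * K_SIGNAL * i / N) for i in range(N)]
# Every other sample (the second channel) carries the extra offset.
with_offset = [x + (OFFSET if i % 2 else 0.0) for i, x in enumerate(ideal)]

tone_at_half_fs = bin_magnitude(with_offset, N // 2)   # bin N/2 is f_S/2
tone_without_offset = bin_magnitude(ideal, N // 2)

print(f"f_S/2 bin with offset   : {tone_at_half_fs:.6f}")
print(f"f_S/2 bin without offset: {tone_without_offset:.6f}")
```

An offset added to every other sample is the sum of a DC term and a term
alternating at f_S/2, each with half the offset magnitude, so the f_S/2 bin in
this sketch comes out at OFFSET/2.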

Figure 9.5. Frequency domain representation of a double-sampled signal in the presence of
channel offset.

2.3 Gain Error

If there is a gain mismatch between the parallel circuits, the sample sequences
they produce have different amplitudes. In the time domain this can be written
as

$$
\begin{aligned}
y(t) &= \sum_{n=-\infty}^{\infty} 1\cdot x(t)\cdot\delta(t - n2T) + (1-\alpha)\cdot\sum_{n=-\infty}^{\infty} x(t)\cdot\delta(t - n2T - T) && (9.4)\\
     &= x(t)\cdot\left[\,\sum_{n=-\infty}^{\infty}\delta(t - nT) - \alpha\cdot\sum_{n=-\infty}^{\infty}\delta(t - n2T - T)\right], && (9.5)
\end{aligned}
$$



where α is the normalized gain mismatch. Figure 9.6 shows that the gain
mismatch in the time domain is equivalent to multiplying the ideal sample
sequence by a sequence of two alternating constant impulses.

In the frequency domain the multiplication corresponds to the convolution
and thus the Fourier transform gives

$$
Y(f) = \frac{1}{T}\sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{T}\right) - \frac{\alpha}{2T}\cdot\sum_{n=-\infty}^{\infty} (-1)^n\cdot X\!\left(f - \frac{n}{2T}\right). \qquad (9.6)
$$

Figure 9.6. Time domain representation of a double-sampled signal in the presence of gain
mismatch.



This is illustrated in Figure 9.7, which reveals that the consequence of gain
mismatch is parasitic sidebands around the multiples of f_S/2. If the signal
bandwidth exceeds f_S/4 the sidebands alias to the signal band, degrading the
signal-to-noise ratio. Even if the spectra do not overlap, filtering is needed to
remove the sidebands.

Gain mismatch originating from capacitor mismatch is a severe problem in
some double-sampled circuits. For example, in ΔΣ-modulators the mismatch
down-converts the shaped noise energy around f_S/2 to the baseband [183].
However, in the S/H circuit shown in Figure 9.3 the gain is always one and
independent of capacitor ratios, and thus gain mismatch is not a problem.
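
The sideband level produced by a given gain mismatch can be estimated with a
small simulation. The sketch below assumes a single coherent input tone and a
normalized mismatch α = 0.01 applied to the second channel as in the time
domain expression above; the helper bin_magnitude and all numeric values are
illustrative assumptions.

```python
import cmath
import math


def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
              for i, s in enumerate(samples))
    return abs(acc) / n


N = 64          # number of samples taken at the full rate f_S
K_SIGNAL = 5    # input tone on bin 5, i.e. f = 5 f_S / 64
ALPHA = 0.01    # assumed normalized gain mismatch of the second channel

ideal = [math.sin(2 * math.pi * K_SIGNAL * i / N) for i in range(N)]
# Odd-numbered samples come from the channel with gain (1 - ALPHA).
mismatched = [x * (1.0 - ALPHA if i % 2 else 1.0) for i, x in enumerate(ideal)]

fundamental = bin_magnitude(mismatched, K_SIGNAL)
image = bin_magnitude(mismatched, N // 2 - K_SIGNAL)   # sideband at f_S/2 - f
image_dbc = 20 * math.log10(image / fundamental)

print(f"image relative to fundamental: {image_dbc:.1f} dBc")
```

Splitting the alternating gain into its mean and its f_S/2 component gives an
image-to-fundamental ratio of (α/2)/(1 - α/2) for this single-tone case, which
is what the simulation reproduces.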

Figure 9.7. Frequency domain representation of a double-sampled signal in the presence of
gain mismatch.

2.4 Timing Skew

There can be a constant timing skew in the clock signals of the two parallel
circuits. This is illustrated in Figures 9.8 (a) and (b), where the sample sequences
taken by each circuit are shown. The sequence y'_2(t) has a constant timing error
ΔT, and thus the sequences can be written

$$
\begin{aligned}
y'_1(t) &= \sum_{n=-\infty}^{\infty} x(t)\cdot\delta(t - n2T) && (9.7)\\
        &= \frac{1}{2T}\sum_{n=-\infty}^{\infty} x(t)\cdot e^{\,jn2\pi t/(2T)} && (9.8)
\end{aligned}
$$

$$
\begin{aligned}
y'_2(t) &= \sum_{n=-\infty}^{\infty} x(t)\cdot\delta(t - n2T - T - \Delta T) && (9.9)\\
        &= \frac{1}{2T}\sum_{n=-\infty}^{\infty} x(t)\cdot e^{\,jn2\pi (t - T - \Delta T)/(2T)} && (9.10)
\end{aligned}
$$



The output of the circuit y'(t) is the sum of y'_1 and y'_2. In practical circuits
it is held between the samples. If the subsequent signal processing is done in
the discrete time domain, the held y'(t) is resampled, resulting in a sequence
of uniformly spaced samples, shown in Figure 9.8 (d). This sequence can be
written as



$$
\begin{aligned}
y(t) &= \sum_{n=-\infty}^{\infty} x(t)\cdot\delta(t - 2nT) + \sum_{n=-\infty}^{\infty} x(t + \Delta T)\cdot\delta(t - 2nT - T) && (9.11)\\
     &= \frac{1}{2T}\sum_{n=-\infty}^{\infty} x(t)\cdot e^{\,jn2\pi t/(2T)} + \frac{1}{2T}\sum_{n=-\infty}^{\infty} x(t + \Delta T)\cdot e^{\,jn2\pi (t - T)/(2T)}. && (9.12)
\end{aligned}
$$

Figure 9.8. Time domain representation of a double-sampled signal in the presence of timing
skew.



The resampling thus corrects the sample misalignment but retains the incorrect 
sample values. 

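The frequency domain representation for the first sample sequence y_1(t) =
y'_1(t) (the sequence without the prime denotes the signal after the resampling)
is

$$
Y_1(f) = \frac{1}{2T}\sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{2T}\right) \qquad (9.13)
$$

and for the second resampled signal

$$
Y_2(f) = \mathcal{F}\left\{\frac{1}{2T}\sum_{n=-\infty}^{\infty} x(t + \Delta T)\cdot e^{\,jn2\pi (t - T)/(2T)}\right\}. \qquad (9.14)
$$

Using the time shift theorem, the Fourier transform in (9.14) yields

$$
Y_2(f) = \frac{1}{2T}\sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{2T}\right)\cdot e^{\,j2\pi f\Delta T}\cdot e^{-jn2\pi (T + \Delta T)/(2T)}. \qquad (9.15)
$$

The output signal is the sum of Y_1 and Y_2; thus,

$$
Y(f) = \frac{1}{2T}\sum_{n=-\infty}^{\infty} X\!\left(f - \frac{n}{2T}\right)\cdot\left[1 + e^{-jn\pi}\cdot e^{\,j2\pi\Delta T\left(f - \frac{n}{2T}\right)}\right]. \qquad (9.16)
$$

Figure 9.9. Frequency domain representation of a double-sampled signal in the presence of
timing skew.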



These equations can be understood with the help of Figure 9.9. There,
the magnitude and the phase of the signal spectra are represented in three
dimensions: the phase in polar coordinates and the frequency axis
perpendicular to the phase plane. To make drawing easier, the baseband signal
spectra are represented as groups of impulses. The uppermost spectrum is
Y_1(f), which is the input signal sampled at the rate of f_S/2, and thus the
baseband spectrum repeats at f_S/2 intervals. The spectrum in the middle is Y_2(f),
which, again, is the input signal sampled at f_S/2. The sampling moment,
however, is ideally shifted by half a clock period compared to Y_1(f). The half period
time domain shift causes the phase of the spectral images at odd multiples of
f_S/2 to be rotated through 180 degrees. The samples, however, do not represent
the input signal values at n2T + T, but at the time ΔT later. This rotates the
edges of the spectral images in the phase plane.

Due to this bending, the sum of these two spectra (the bottom sequence) has
some remnants of the spectral images at odd multiples of f_S/2, which would
ideally (ΔT = 0) be canceled out. If the bandwidth of the input signal is greater
than f_S/4, these remnants alias into the signal band. The wider the bandwidth, the
larger the error signal, since the phase rotation is proportional to the frequency
offset from the center of the image. Considering the time domain signal, this
is reasonable, since the input signal change between the ideal and the actual
sampling moment (i.e. the error) gets larger as the input frequency increases.

When the input is a sinusoidal signal (frequency f), the error is a tone at the
frequency f_S/2 - f. When the magnitude of the error image is small compared
to the fundamental, it can be approximated with 20 log(|πΔT f|) dBc, which
is obtained from (9.16) using the small angle approximation.
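
The small angle approximation can be compared against a direct simulation. The
sketch below samples a sinusoid with every second sample taken ΔT late, places
the samples on a uniform grid (the resampling step), and measures the image at
f_S/2 - f. The helper bin_magnitude, the normalization T = 1, and the skew of
1% of T are assumptions made only for this example.

```python
import cmath
import math


def bin_magnitude(samples, k):
    """Magnitude of DFT bin k, normalized by the number of samples."""
    n = len(samples)
    acc = sum(s * cmath.exp(-2j * math.pi * k * i / n)
              for i, s in enumerate(samples))
    return abs(acc) / n


N = 64               # number of samples at the full rate f_S = 1/T
K_SIGNAL = 5         # input tone on bin 5
T = 1.0              # full-rate sampling period (normalized)
DELTA_T = 0.01 * T   # assumed timing skew of the second channel

f_in = K_SIGNAL / (N * T)
# Odd-numbered samples are taken DELTA_T late but stored on the uniform grid.
skewed = [math.sin(2 * math.pi * f_in * (i * T + (DELTA_T if i % 2 else 0.0)))
          for i in range(N)]

fundamental = bin_magnitude(skewed, K_SIGNAL)
image = bin_magnitude(skewed, N // 2 - K_SIGNAL)   # tone at f_S/2 - f

simulated_dbc = 20 * math.log10(image / fundamental)
predicted_dbc = 20 * math.log10(abs(math.pi * DELTA_T * f_in))

print(f"simulated: {simulated_dbc:.2f} dBc")
print(f"predicted: {predicted_dbc:.2f} dBc")
```

For a single tone the exact ratio of image to fundamental is tan(πΔT f), so the
approximation holds as long as πΔT f is much smaller than one.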

The sources of the timing error are device mismatches in the clock generation 
circuit and uneven clock line capacitances. If the clock signals for the parallel 
circuits are generated using both the rising and falling edges of an external 
half-speed clock, the deviation of its duty cycle from 50% is seen as the timing 
error. These errors can be minimized but not totally eliminated by a careful 
layout design and by using a full-speed external clock. 

3. Skew-Insensitive Circuit 

To overcome the timing skew problem, a modification that makes
double-sampled circuits insensitive to timing errors has been proposed by
the author in [9]. There, the idea is to perform the sampling with a single switch
rather than two parallel switches. In Figure 9.10 the technique is applied in a
double-sampled S/H circuit.

The sampling is performed by the switch S0 clocked with the signal φ_S
(Figure 9.11). The switches S1 and S2 act as a multiplexer, which controls the
alternate use of the common sampling switch in the parallel circuits. When the
half circuit with the sampling capacitor C_S1 is in tracking mode, switches S0, S1,
S2, and S3 are on and switches S5 and S7 off. The sample is taken by applying
a short zero pulse to switch S0, during which switch S1 is turned off, followed
by S3 being turned off. Next, switches S5 and S7 are closed, connecting the
capacitor C_S1 to the feedback loop around the amplifier.

The time gap between turning off S0 and S1 must be quite short, since the
voltage at the floating node n0 changes along with the input voltage. The
voltage change causes a part of the sampled signal charge to be distributed to
the parasitic capacitance in that node. When switch S1 is turned off the charge
in n0 is isolated, distorting the total sampled charge. In addition to making
the timing gap small, minimizing the parasitic capacitance in n0 also helps to
diminish this error source.

Figure 9.10. Proposed timing skew-insensitive double-sampled S/H circuit.

Figure 9.11. Timing of the skew-insensitive S/H circuit.

Skew removal requires that the sampling pulse goes to zero before the
multiplexer switch (S1 or S2) is turned off and that it remains zero until the
turn-off has been completed. The pulse can be longer, even half of the clock
period long, but the acquisition time of the sampling circuit becomes shorter
as the pulse length is increased. However, if the circuit can track the input
signal in half a clock period, a full-rate clock with a 50% duty cycle can be
used, which simplifies the design of the clock generator.

This skew removal technique has been tested with the S/H circuit proto- 
type presented in Chapter 12. Another similar technique has been developed 
(independently from the author) by Gustavsson and Tan [188]. 




Chapter 10 



SWITCHED OPAMP TECHNIQUE 



As discussed in Chapter 2, the inadequate switch transistor gate overdrive
is the main obstacle to the low-voltage operation of standard SC circuits. In
Chapter 6 the bootstrapped switches were investigated as a solution for this
overdrive problem. Their biggest disadvantage is the fact that even the simplest
switch circuits may have more than ten transistors, which increases the area
and complexity of an SC circuit, which contains dozens of such switches. The
idea in the switched opamp (SO) technique is not to modify the switch but the
circuit itself, so that the switches always have the maximum possible overdrive,
which is V_DD - V_T.

1. Operation Principle 

Let us look at the integrator implemented in the standard SC technique, shown
in Figure 10.1. There, two types of switches can be identified: series switches
(S1 and S5), which pass signals, the level of which varies over the whole voltage
range, and shunt switches (all the rest), whose other terminal is connected to
the analog ground potential. The switch S4, although a series switch in a sense,
falls into the latter category, since it operates against the virtual ground.

It is always possible to select a ground voltage equal to V_SS (or V_DD),
which provides the maximum possible voltage overdrive for nMOS (or pMOS)
switches. Similarly, the level of the virtual ground can be set freely, but it
affects, of course, the design of the opamp input stage. The series switches
are connected to the opamp output, where the signal common mode level is set
to V_DD/2 to maximize the signal range. Consequently, the overdrive voltage,
regardless of the switch type, is only V_DD/2 - V_T in the worst case.
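
A short numeric example makes the difference between the two switch types
concrete. The sketch below uses an assumed 1.0 V supply and a 0.4 V threshold
voltage, ignores the body effect, and evaluates the series switch at the
V_DD/2 common mode level used in the text; the function name overdrive is
hypothetical.

```python
def overdrive(gate_voltage, terminal_voltage, v_t):
    """Gate overdrive of an nMOS switch whose terminal sits at terminal_voltage."""
    return gate_voltage - terminal_voltage - v_t


V_DD, V_SS, V_T = 1.0, 0.0, 0.4   # assumed supply and threshold voltages

shunt_overdrive = overdrive(V_DD, V_SS, V_T)        # switch tied to the V_SS ground
series_overdrive = overdrive(V_DD, V_DD / 2, V_T)   # switch at the opamp output CM level

print(f"shunt switch : {shunt_overdrive:.2f} V")
print(f"series switch: {series_overdrive:.2f} V")
```

With these assumed numbers the shunt switch keeps an overdrive of V_DD - V_T,
while the series switch is left with only V_DD/2 - V_T, which shrinks toward
zero as the supply approaches twice the threshold voltage.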

The main idea in the switched opamp technique is to eliminate these series
switches. The function of this type of switch, e.g. S5 in Figure 10.1, is to
disconnect the opamp from the next stage sampling capacitor C_S2 in phase φ,
when the capacitor's left terminal is shorted to the ground. If the opamp was not
disconnected, there would be a race between it and switch S6. In an SO circuit
this condition is avoided by turning the opamp output into a high impedance
state during phase φ. Thus, like a tri-state logic circuit, the opamp does not
resist the pulling of its output to the ground and so there is no need for a series
switch. The SO implementation of the integrator, where the series switches are
eliminated and the opamp made switchable, is shown in Figure 10.2.

The SO technique was first introduced in [189] and [34] and further developed 
in [190] by making the circuit fully differential and separating the input and 
output common mode levels. Reported SO circuits include filters [34, 191],
ΔΣ modulators [192, 193], and the author's two pipelined ADCs [8, 3].

The rest of this chapter covers the implementation issues and the design of a 
switchable opamp. Interfacing the SO circuit to the outside world is discussed 
as well. 



Figure 10.3. DC voltage correction with an extra capacitor C_DC.

2. Compensating for Common Mode Voltage Step 

In SO circuits the signal DC level, or common mode level in fully differential 
circuits, is not the same in the two clock phases. The consequences of this and 
a method to avoid them are studied next. 

Let us first ignore the capacitor C_DC in the SC amplifier of Figure 10.3 and
assume that the signal AC voltage is zero. In the sampling phase (φ = 1) V_in is
V_DD/2, V_out is 0 V, and node n1 is shorted to V_DD. In the amplification phase
(φ = 0) the situation in the input and the output is reversed; now V_in = 0 and
V_out = V_DD/2. When the capacitors C_S and C_F are equal, node n1 stays in
balance, like the middle of a teeter board.

Usually, the capacitors are not equal, which results in an error that is seen
as an offset in single-ended circuits and as a change in the opamp input CM level
in fully differential circuits. In [190] an extra switched capacitor (capacitor C_DC
in the figure), which injects a constant correction charge into the node n1 every
clock cycle, was proposed to overcome the problem. Setting the capacitor value
to (C_S - C_F)/2 (assuming C_S > C_F) balances the voltage. In an integrator,
the feedback capacitor is not reset and as a result the required C_DC is C_S/2.
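
The sizing of the correction capacitor follows from a charge balance at node
n1, which can be written out as a small calculation. The sketch below assumes
the voltage steps described above (input from V_DD/2 to 0, output from 0 to
V_DD/2) and additionally assumes that the far plate of C_DC is stepped by the
full supply voltage; the feedback capacitor is written C_F here, and the
function name and capacitor values are hypothetical.

```python
def net_charge_into_n1(v_dd, c_s, c_f, c_dc):
    """Net charge pushed into the virtual ground node n1 between the phases.

    Assumed voltage steps (signal AC voltage zero):
      input side of C_S : V_DD/2 -> 0
      output side of C_F: 0 -> V_DD/2 (use c_f = 0 for an integrator,
                          whose feedback capacitor is not reset)
      far plate of C_DC : stepped by the full supply V_DD (assumption)
    """
    dq_cs = c_s * (0.0 - v_dd / 2)
    dq_cf = c_f * (v_dd / 2 - 0.0)
    dq_dc = c_dc * v_dd
    return dq_cs + dq_cf + dq_dc


V_DD = 1.0                # assumed supply voltage
C_S, C_F = 2e-12, 1e-12   # assumed sampling and feedback capacitors

unbalanced = net_charge_into_n1(V_DD, C_S, C_F, c_dc=0.0)
amplifier = net_charge_into_n1(V_DD, C_S, C_F, c_dc=(C_S - C_F) / 2)
integrator = net_charge_into_n1(V_DD, C_S, 0.0, c_dc=C_S / 2)

print(f"without C_DC          : {unbalanced:.2e} C")
print(f"amplifier with C_DC   : {amplifier:.2e} C")
print(f"integrator with C_DC  : {integrator:.2e} C")
```

Under these assumptions the net charge vanishes for C_DC = (C_S - C_F)/2 in the
amplifier and for C_DC = C_S/2 in the integrator, matching the values given
above.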

3. Preventing Charge Leakage from Virtual Ground 

The pn junctions at the drain and source of a MOS switch transistor form
reverse-biased diodes to the transistor bulk, which is normally connected to V_SS
in nMOS transistors and to V_DD in pMOS devices. If the signal voltage exceeds
V_DD in a node where a pMOS switch is connected, the diode becomes forward
biased and may leak some of the charge stored in that node. A similar situation
occurs with an nMOS device when the voltage goes below V_SS. Typically, a
momentary forward bias of less than 500 mV is not harmful with silicon diodes.

Figure 10.4. Switch transistor junction diode may cause charge leakage.

A node where the voltage may peak beyond the allowed limits, and where
the charge conservation is of the utmost importance, is the virtual ground in the
opamp input. The situation in an SO amplifier which has the virtual ground
(n1) set to V_DD is illustrated in Figure 10.4. The junction diode of the pMOS
switch S2 is shown between node n1 and V_DD.

In Figure 10.5 (a) the voltages in the input, the output, and the virtual ground
are shown. The dotted curves represent the response for the minimum and
maximum input signal voltages. The curves shown are for a circuit where
C_S and C_F are equal. The switches and the opamp output stage are operated
simultaneously. Consequently, due to the opamp's finite bandwidth, slew rate,
and the speed of the common mode feedback, the output voltage changes more
slowly than the input voltage. As a result, the voltage at node n1 peaks safely
downward. Charge injection from the switch S2 causes upward peaking at the
very beginning of the transient.

When the capacitor C_DC is present, it pushes the virtual ground in the
direction opposite to that of C_S. The integrator is an extreme case, where the
effect of C_DC on the DC voltage is maximal. Assuming that the switched capacitors
have equal time constants, the signal voltage may raise the virtual ground well
above V_DD before the opamp responds and nullifies the voltage difference
between its input terminals. The situation can be somewhat alleviated by ensuring
that the voltage step caused by C_DC comes later than the one caused by C_S,
either by making its time constant larger (by controlling switch on-resistance)
or delaying the clock phase controlling the switching [191]. It has to be noted,
however, that in very low-voltage circuits the signal voltages are also small, not
easily causing spikes in excess of the permissible ~500 mV.

Figure 10.5. Voltages at the circuit input (V_in), opamp output (V_out), and the virtual ground
(V_vg) during a transient.

In an SO amplifier it is possible to set the output reset level and the virtual
ground at the same potential (both to V_DD or V_SS). In an integrator, forward
biasing the diodes at the opamp input during the sampling phase is not easily
avoidable (it can, however, be done with yet another set of switched capacitors [194]),
and thus the levels are normally different [193]. Figure 10.5 (b) shows again the
voltages in the input, the output, and the virtual ground in an SO amplifier. Now
charge leakage is a more serious risk, since the input step pushes the virtual
ground in a hazardous direction. Delaying the switching of the input capacitors
compared to C_DC and the opamp makes this configuration also usable, as shown
with dotted curves in the figure.

4. Speed 

Achieving clock rates with SO circuits as high as with traditional SC circuits 
is not possible. There are several reasons for this, the most important of which 
is the finite opamp recovery time from the off-state. In addition, the limitation 
that the capacitors are permanently connected to the output of an opamp and the 
added extra capacitors lead, in many cases, to circuit topologies with a lower 
feedback factor than in SC realizations. 

Opamp recovery time depends on the settling of the internal nodes and on
charging the output capacitance by the voltage step of V_DD/2. The internal settling
can be improved by switching only the opamp output stage, disconnecting the
compensation capacitors during the off-phase, and performing the switching
with series switches between a current source transistor source and the power
supply rather than a shunt switch at the current source transistor gate. In a
differential SO circuit, switching of the opamp causes a V_DD/2 common mode
voltage step at the output. To minimize the recovery time the common mode
feedback circuit has to be fast and the opamp slew rate toward the mid-supply
level has to be high.

Figure 10.6. SC MDAC for a 1.5-b pipeline stage.

Figure 10.7. SO MDAC for a 1.5-b pipeline stage.

The penalty imposed by permanently-connected capacitors can be clearly 
seen when an SO MDAC is compared to its SC counterpart. Figure 10.6 shows 
the SC circuit typically employed in pipelined ADCs using the 1.5 bits/stage 
architecture. The input voltage is sampled in two equally-sized capacitors. In 
the hold phase one capacitor is used as a feedback capacitor while the other is 
connected to a reference voltage. 

The SO implementation of the same circuit [3] is shown in Figure 10.7. 
Since the sampling capacitor cannot be disconnected from the previous stage, 
a separate feedback capacitor and capacitors for D/A conversion (subtracting 
the reference voltage) are needed. 

The settling speed of the circuit is determined not only by the opamp bandwidth
but also by the feedback factor of the circuit, which is given by the ratio of the
feedback capacitor to the total capacitance. In the traditional circuit it is 1/2, while
the larger number of capacitors in the SO circuit lowers it to 1/4. Expanding the
analysis to a general MDAC, suitable for a k-bit pipeline stage, whose gain is
G (= 2^k), yields a feedback factor of 1/G for the SC circuit and 1/(3G/2 + 1)
for the SO one.
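
The two feedback factor expressions are easy to tabulate for different stage
gains. The sketch below simply evaluates the formulas quoted above; the
function names are hypothetical, and the remark on bandwidth assumes a
single-pole opamp model in which the closed-loop bandwidth scales with the
feedback factor.

```python
def feedback_factor_sc(gain):
    """Feedback factor of the conventional SC MDAC with stage gain G."""
    return 1.0 / gain


def feedback_factor_so(gain):
    """Feedback factor of the SO MDAC with stage gain G."""
    return 1.0 / (1.5 * gain + 1.0)


for gain in (2, 4, 8):
    beta_sc = feedback_factor_sc(gain)
    beta_so = feedback_factor_so(gain)
    # Under the single-pole assumption this ratio is the bandwidth penalty.
    print(f"G = {gain}: SC {beta_sc:.3f}, SO {beta_so:.3f}, "
          f"ratio {beta_sc / beta_so:.2f}")
```

For the 1.5 bits/stage case (G = 2) the expressions reproduce the values 1/2
and 1/4 given in the text, and the ratio approaches 1.5 as the stage gain
grows.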

One additional speed penalty of the SO technique is the lack of double- 
sampling capability. Providing a valid output signal in both clock phases can, 
however, be accomplished by adding to the opamp a second parallel output 
stage, as the example, reported in [195], with a pseudo-2-path filter shows. 

5. Power Supply Rejection and Noise 

How switching the capacitors to V_DD and V_SS affects the power supply
rejection ratio is an unavoidable question with SO circuits. It should be noted
that in principle the voltages where the capacitors are connected need only to
be at the same level as the supply voltages; no direct on-chip connection is
required.

If the virtual ground level and the output reset level are equal (both V_DD or
V_SS), this voltage level can be considered as a signal ground, against which
all the signals are referred. For signal capacitors the situation is not different
from that in traditional SC circuits. The extra capacitor C_DC, in contrast, is
connected to the opposite supply rail, and the differential noise between the
rails is coupled to the signal voltage, only attenuated by the capacitor ratio,
which is typically of the order of 6 dB [191].

Alternatively, if the virtual ground level, which is also the level of the
sampling ground, and the output reset level are opposites (one V_SS and the other
V_DD, or vice versa), the situation is worse. Now the signal is sampled against a
different voltage than the one against which the signal is referred in hold mode.
As a result, the differential noise between these levels is directly summed to the
signal voltage, i.e. the rejection is 0 dB.

Making the circuit fully differential ideally blocks all the noise that has been 
discussed. The rejection ratio is, of course, finite because of capacitor mismatch, 
but should typically be at least 50 dB. Thus, even if the reference levels are the 
analog V_DD and V_SS, noise coupling through other routes is likely to dominate.
In practice, all but the very first reported SO circuits have been fully differential 
implementations. 

The extra switched capacitor C_DC adds its contribution to the thermal noise
[193]. When its value is C_S/2, the increase in noise level is 1.8 dB, which can
be compensated for by increasing the sampling capacitor by 50%.
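
The quoted noise figures can be checked with a simple kT/C estimate. The sketch
below assumes that the sampled noise charge power is proportional to the total
switched capacitance C_S + C_DC and that it is referred to the input through
C_S; this model, the function names, and the normalized values are assumptions
made for illustration.

```python
import math


def input_referred_noise_power(c_s, c_dc, kt=1.0):
    """Sampled noise charge power kT*(C_S + C_DC), referred to the input via C_S."""
    return kt * (c_s + c_dc) / c_s ** 2


def noise_increase_db(c_s, c_dc):
    """Noise increase caused by C_DC relative to sampling with C_S alone."""
    ratio = input_referred_noise_power(c_s, c_dc) / input_referred_noise_power(c_s, 0.0)
    return 10.0 * math.log10(ratio)


C_S = 1.0          # normalized sampling capacitor
C_DC = C_S / 2     # correction capacitor value used in an integrator

increase_db = noise_increase_db(C_S, C_DC)

# Scaling the sampling capacitor (and with it C_DC) by 1.5 restores the noise level.
original = input_referred_noise_power(C_S, 0.0)
compensated = input_referred_noise_power(1.5 * C_S, 1.5 * C_DC)

print(f"noise increase: {increase_db:.2f} dB")
print(f"compensated / original noise power: {compensated / original:.3f}")
```

Under this model the noise power grows by a factor of 1.5, i.e. about 1.8 dB,
and a 50% larger sampling capacitor brings it back to the original level.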

6. Switchable Opamps 

The switchable opamp is not essentially different from a conventional
low-voltage opamp. In many cases an opamp can be made switchable simply by
adding one or two transistors. Thus, the most important requirement for the
opamp is a low-voltage capability. From the switch perspective the SO
technique requires a supply voltage which is only V_T plus some overdrive. The same
is expected from the opamp. A differential pair can operate with a V_T + 2V_dsat
supply, and hence it complies with the requirement. The other structures
employed in the opamp need to be chosen in such a way that this limit is not
significantly exceeded.

The choice of the common mode potential of the virtual ground and the type
of the opamp input pair are linked; setting the CM level to V_DD requires nMOS
input transistors, while setting it to V_SS calls for pMOS devices. If low 1/f
noise is required, a pMOS input pair is preferred, while nMOS devices provide
larger g_m leading to wider bandwidth and lower thermal noise. On the other
hand, the choice of the input CM level also determines the type of the switches
connected to the virtual ground, which has an effect on their on-resistance and
parasitic capacitances.

The role of the common mode feedback circuit is emphasized in SO circuits 
for two reasons. First, the low voltage sets some limitations on applicable 
circuit structures and second, the change in output common mode level between 
the on and off phases requires high-speed common mode feedback capable of 
supplying high common mode slewing currents. 

Finally, the opamp, or at least its output stage, has to be switchable. The 
most important requirement for the switching method is a fast recovery from 
the off state. 

Next, some opamp circuits from the literature and three new proposals are 
studied in more detail. 

6.1 Circuits from the Literature 

6.1.1 Steyaert’s Switchable Opamp 

The opamp in the original paper [34] by Steyaert and Crols is derived from the
classic Miller topology and shown in Figure 10.8. The switching is implemented
with two switches, one shunting the gates of pMOS current source transistors
to V_DD and the other cutting off the current path between the output stage
transistor M7 and V_SS. The circuit is single-ended and hence rather a proof
of concept than a practical building block. Nevertheless, besides the biquad in
the original paper, a ΔΣ modulator utilizing the opamp has been reported in
[192]. The most prominent shortcoming of this circuit is the long recovery time
resulting from the need to charge and discharge the gate capacitances of M5,
M6, and M8 every clock cycle with a constant bias current I_B. The circuit has
a minimum supply voltage of V_T + 3V_dsat and it does not allow for the setting
of the input CM level to V_SS.




Figure 10.8. Steyaert’s switchable opamp [34]. 




Figure 10.9. Fully differential switchable opamp [191]. 



6.1.2 Fully Differential Switchable Opamp 

The fully differential opamp [196, 191] shown in Figure 10.9 improves the 
previous circuit in various ways. To begin with, the minimum supply voltage 
has been squeezed to Vp + 2 Vdsat by folding the first stage, which also allows 
for the setting of the input CM voltage to Vss- 

The recovery time has been improved by minimizing the effect of switching
on the opamp internal nodes. Thus, the first stage is not switched at all and the
output stage is only disconnected from the Vss rail, since there is no need to
disconnect it from Vdd, to which the output is shorted anyway. The switching
is done by cutting off the output transistor current with a series switch, which
leads to faster recovery than shorting the transistor gate to Vdd, because the
first stage output is not disturbed and the voltage change over the gate-source
capacitance is smaller. The compensation capacitors are disconnected during
the off-phase to avoid discharging them.
Figure 10.10. Common mode feedback circuit for the opamp shown in Figure 10.9.



A fully differential opamp needs a common mode feedback circuit. In a two- 
stage opamp the common mode voltages at the outputs of both the stages need 
to be controlled. This is most often accomplished by enclosing both stages in a 
single control loop, as also done in this opamp. The CMFB signal is applied to 
the first stage via the gate node of nMOS current source transistors M3 and M4. 
For the common mode signal the opamp has two cascaded inverting stages, and 
thus a negative feedback requires one extra inversion in the feedback path. 

Since a series switch cannot be put in the output of the opamp, the traditional
SC common mode feedback circuit [177] cannot be directly utilized in SO
circuits. The circuit shown in Figure 10.10 performs the common mode sensing
with a capacitive divider consisting of capacitors CP and CM, which are
permanently connected to the opamp outputs. The common mode voltage step is
offset with the capacitor CDC. As a result, node n1 is nominally equal to Vss
in both clock phases. The signal inversion and the generation of a proper output
DC level are performed by the feedback amplifier, consisting of a capacitor CF
and a single-ended opamp A1. (The circuit is actually an SC integrator, but here
its transfer function is merely of interest in the continuous time domain, hence
the term amplifier.) The circuit can be used without problems in low voltage
applications, since all the switches are operated against Vss or Vdd and the
virtual ground is set to Vss. It is, however, difficult to achieve simultaneously
fast and stable CMFB because of the extra inverting stage in the feedback loop.

6.1.3 Class AB Switchable Opamp 

A switchable opamp (Figure 10.11) based on class AB operation was proposed
in [197] and later utilized in a ΔΣ modulator [193]. The fully differential
circuit can operate with a VT + 2 Vdsat supply and the class AB structure
provides a moderate DC gain and high output driving capability. The common
mode feedback is realized with a structure similar to the one in Figure 10.10,
Figure 10.11. Class AB switchable opamp [197].



except that the SC amplifier is replaced with a simple open loop transconductor.
The feedback signal is applied to nodes n1 and n2 in the form of two equal
currents. Due to the single gain stage opamp architecture and the simplifications
in the CMFB circuit, the resulting common mode settling can be made
substantially faster than in the previous circuit. The switching is realized with
series switches between the opamp local supply rails and the chip's global
supply rails.

The main disadvantage of the opamp is the fact that the circuit has practically
only one gain stage, which, together with the large number of transistors, makes
it noisy and sensitive to offset voltages. In addition, the low voltage current
mirrors M5 and M6 restrict the minimum input common mode voltage to 2 VT -
Vdsat below Vdd. Consequently, there is a rather low maximum allowed supply
voltage, which can be a limitation in applications requiring the capability of
operating in a wide supply voltage range.

6.2 Proposed Opamps 

6.2.1 Opamp1

The first proposed switchable opamp has been developed for a pipelined 
ADC [8]. The circuit, shown in Figure 10.12, is based on a fully differential
Miller opamp. To maximize the bandwidth and to minimize the needed supply 
voltage, the signal to the output stage is connected via the nMOS side. As a 
Figure 10.12. Fully differential switchable opamp based on the Miller topology.





Figure 10.13. Common mode feedback circuit for the opamp shown in Figure 10.12.

result the allowed supply voltage is in the range from VT + 2 Vdsat to 2 VT + Vdsat.

Only the output stage is switched. To prevent degeneration of the differential 
gain, the source nodes of M6 and M7 are actually connected together above the 
switches, although this is not shown in the schematic. 

Figure 10.13 shows the common mode feedback circuit, which is based 
on the one shown in Figure 10.10. The signal inversion, however, is realized 
differently — here with an open loop buffer instead of an opamp-based feedback 
circuit. 

The main advantages of this opamp are its wide bandwidth, rather low ther- 
mal noise, and simplicity. On the other hand, the circuit suffers from a slow 
common mode slew rate, which is partly a consequence of the compensation 
capacitors not being disconnected during the off-phase. The capacitors can- 
not be disconnected, since their input stage side terminal is at a potential of 
VT + Vdsat, which does not permit the realization of a well-conducting switch.



Figure 10.14. Switchable opamp with enhanced CMFB. 

As a result, when entering the on-phase, the common mode feedback, already 
not particularly fast, has to load the capacitors back to the nominal CM voltage, 
which slows it down considerably. Another drawback of this architecture is the 
limitation of the maximum supply voltage, which may be important in some 
applications. 

6.2.2 Opamp2 

The main problem with switchable opamps seems to be the recovery speed 
from the off state. In the next proposed opamp [7] the problem is tackled by 
increasing the bandwidth of the CMFB loop and by enhancing the common 
mode slew rate from the reset level toward the mid-supply level.

The opamp architecture, shown in Figure 10.14, is basically the same two- 
stage structure with a folded first stage as in Figure 10.9. The main difference 
is in the design of the common mode feedback circuit, where the main goal 
has been to get rid of the CMFB loop enclosing both the opamp stages, plus 
an extra inverter stage. This can be accomplished by having a separate CMFB 
loop for each stage. The first stage loop, however, is difficult to implement, 
and making the two loops settle nicely at the same time can be tricky. Thus, 
the implementation is based on a different idea, which is to totally eliminate 
the need for CMFB in the first stage and realize the second stage CMFB with 
a simple passive SC circuit, yielding a fast and stable single pole CM settling 
behavior. 

The first stage is loaded, instead of simple current sources, with a structure
commonly utilized in comparator preamplifiers [198]. It consists of the four
pMOS transistors M8-M11, two of which (M10 and M11) are connected as
diodes, while the other two are in parallel with them and have their gates
cross-coupled to the complementary signal branch. Considering first a common
mode signal, there is no difference between nodes n3 and n4, so they can be
considered as shorted together. Consequently, the impedance seen from either of
the nodes is approximately the parallel combination of the two transconductances,
this being 1/(gm8 + gm10) for node n3. This is a low impedance, which yields a
stable enough common mode voltage without any feedback!

When a differential signal is applied, the transconductance of the
cross-coupled device appears with a negative sign and the equation, where the gds
terms can no longer be neglected, becomes

\[ r_3 = \frac{1}{g_{m10} - g_{m8} + g_{ds10} + g_{ds8}} \tag{10.1} \]

Making the devices M8 and M10 identical and assuming perfect matching, the
gm terms in the equation cancel each other out, resulting in a high differential
impedance, which is needed to achieve a high DC gain.

The potential problem in this type of load is the possibility that, for some
reason, the impedance will become negative, making the opamp unstable.
Mismatch between M8 and M10 is obviously one such reason, but it is typically
small enough when the layout is drawn with care. Another, probably more serious
threat is the dynamic mismatch, which is due to the fact that the gm cancellation
works only in small signal conditions. When the opamp is slewing, there can be a
significant difference in the gm's, which makes the impedance in one branch
momentarily negative. This causes oscillations in settling and makes the
settling considerably longer. Thus, when a high differential slew rate is
required, this opamp is probably not the best choice. Methods for alleviating
the potential problems include sizing the cross-coupled devices slightly smaller
than the diode devices or using source degeneration. The slewing behavior can
be improved by increasing the bias current.
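
The behavior of the cross-coupled load can be illustrated numerically. The
following is a minimal sketch, not part of the original design flow: the
function names and the transconductance values are assumptions chosen only for
illustration. It evaluates the common mode resistance 1/(gm8 + gm10) and the
differential resistance of (10.1) for a matched load, for a cross-coupled
device sized 5% smaller than the diode device, and for one that is 5% larger.

```python
def common_mode_resistance(gm_cross, gm_diode):
    """Resistance seen at a load node for common mode signals."""
    return 1.0 / (gm_cross + gm_diode)


def differential_resistance(gm_cross, gm_diode, gds_cross, gds_diode):
    """Resistance seen at a load node for differential signals, as in (10.1).

    The cross-coupled transconductance enters with a negative sign, so a
    negative result indicates the unstable, negative-impedance case.
    """
    return 1.0 / (gm_diode - gm_cross + gds_diode + gds_cross)


# Illustrative small-signal values in siemens (assumptions, not from the text).
GM = 1e-3
GDS = 10e-6

r_cm = common_mode_resistance(GM, GM)
r_diff_matched = differential_resistance(GM, GM, GDS, GDS)
r_diff_small_cross = differential_resistance(0.95 * GM, GM, GDS, GDS)
r_diff_large_cross = differential_resistance(1.05 * GM, GM, GDS, GDS)

print(f"common mode:                 {r_cm:10.1f} ohm")
print(f"differential, matched:       {r_diff_matched:10.1f} ohm")
print(f"differential, cross -5%:     {r_diff_small_cross:10.1f} ohm")
print(f"differential, cross +5%:     {r_diff_large_cross:10.1f} ohm")
```

With these assumed numbers the slightly undersized cross-coupled device gives
up part of the differential impedance but keeps it positive, which is the
trade-off behind the sizing remedy mentioned above.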

For fast recovery, it is desirable that during the off-phase the first stage
output does not saturate in the presence of input offset, which is avoided by
shorting nodes n1 and n2.

The common mode slew rate has to be high only in one direction, from Vdd
to the mid-supply level in this circuit. The slew rate is maximized by having
an active pull-down, realized with M12 and M13 and the CMFB, instead of a
pull-down with a constant current.

Disconnecting the compensation capacitors in the off-phase serves two
purposes. It enhances the common mode slew rate, since when connected back
again the capacitors pull the nodes n1 and n2 up at the beginning of the
settling. This works in the same direction as the common mode feedback and has
a boosting effect on it. Furthermore, if the consecutive output signal values




Figure 10.15. CMFB for the opamp in Figure 10.14.

do not differ much, preserving the charge in the capacitors also reduces the 
differential settling time. 

The CMFB circuit is shown in Figure 10.15. The sensing circuit, consisting
of the capacitors CP, CM, CDC, and the attached switches, is again similar to
the one in Figure 10.10. Here signal inversion is not needed, only level shifting
to a proper level, which is realized with a level-shifting capacitor. The
capacitor is precharged to the bias voltage VB, which is generated with a bias
current and a scaled-down replica of the opamp output current source. The
voltage VB is approximately equal to VT,n + Vdsat, which is a high enough
potential for a pMOS switch. The switch, however, has a rather high
on-resistance as a result of the minimal overdrive, but this is not a problem,
since, once the level has been reached, the purpose of the switch is only to
maintain a constant precharge level in the capacitor.

What makes the realization of the CMFB more complicated than it first looks
is the fact that, due to the parasitic capacitances in the opamp output stage
devices M12 and M13, setting the output CM level correctly is not straightforward.
This is illustrated in Figure 10.16, where the voltages at the transistor terminals
in both clock phases are shown. The voltage change disturbs the common mode
feedback because of feedthrough via the parasitic gate-source and gate-drain
capacitances, resulting in too high an output common mode level. The problem
is reduced to a tolerable level by making the capacitors CP and CM large in
comparison to the parasitics. Other possibilities include isolating the transistor
gate in the off-phase with an additional series switch, using a dummy
structure to compensate for the feedthrough, or buffering the CMFB signal with a
continuous time buffer such as the one employed in opamp1.

The proposed circuit has since been used in a switched opamp ΔΣ modulator
by Sauerbrey and Thewes [199], and a modified version of the circuit, where
the cross-coupled load is on the nMOS side and the output stage uses nMOS gain
devices, in another low voltage ΔΣ modulator by Dessouky and Kaiser [143].
Figure 10.16. DC voltages at terminals of transistor M12 and its parasitic capacitances.

Figure 10.17. Fully differential switchable opamp.



6.2.3 Opamp3

The third switchable opamp [3] is targeted to a pipelined ADC, where wide 
bandwidth, high slew rate, and relatively large DC gain are essential. In con- 
trast, because of the small feedback factor in SO MDACs, the phase margin 
requirement is not critical. Consequently, nMOS gain devices are utilized in 
both stages, yielding the architecture shown in Figure 10.17, which is practi- 
cally an nMOS version of the opamp2 with a conventional current source load. 
The minimum supply voltage is again VT + 3 Vdsat.

The achievable DC gain is larger than in the previous two opamps utilizing a 
cascode first stage, since here the transconductance in both the stages is realized 
with nMOS devices. In addition, the extra Vdsat in the supply voltage, compared
to the opamp shown in Figure 10.9, is over the transistors M8 and M9, just
where it is most critically needed to improve first stage output impedance.

As discussed in Chapter 7, the combination of pMOS cascode devices and 
nMOS transconductances in the output stage is bad from the point of view of 



Figure 10.18. Alternative method to feed the CMFB signal to the first stage. 

fast settling. When not aiming at unity gain stability, the situation is, however, 
manageable. Although the criterion (7.7) is not totally met, no peaking is 
observed, which is probably due to the stabilizing effect of the intrinsic Miller 
capacitance, formed by the gate-drain capacitances of devices M14 and M15. 

The main reason for returning to a normal current source load is the require- 
ment for a high differential slew rate, which is not easily obtained with opamp2 
for the reasons given. 

If the compensation capacitors were disconnected in the off-phase, the signal 
value from the previous clock period would remain on them. Since, in a pipeline 
ADC, voltage changes in opamp output can be from the smallest negative to 
the largest positive, this memory effect would increase the settling time in the 
worst cases. Thus, the capacitors are not disconnected, which also improves the 
phase response, since without the switches there is no series resistance with the 
capacitors. The disadvantage of not disconnecting the capacitors, as in opamp1,
is the increased demand for downward common mode slewing current in the 
opamp first stage. 

The novel features of this circuit are again in the common mode feedback. 
The feedback signal is applied in the first stage, but for maximum speed and 
stability the number of extra nodes in the CMFB loop is minimized. Because of 
the need for signal inversion the sensed CM signal cannot be directly connected 
to the gate of any transistors in the first stage. Thus, the extra pMOS devices 
M10 and M11 are added to the cascode nodes and their currents are controlled
with the CMFB signal. Alternatively, the same can be achieved with a single
nMOS device connected to the node n0, as shown in Figure 10.18. The latter
offers easier CM biasing (CMFB node biased nominally to Vdd) and less
noise, but offers only limited slewing current for charging the compensation 
capacitors in contrast to the solution employed, which sinks a large downward 
current from the cascode nodes. 

The common mode sensing is realized with the same structure as earlier
(Figure 10.19); now, however, nothing else is needed. The capacitors are
reset, instead of Vss, against voltage VC, which is equal to the bias voltage
for the cascode devices in the opamp. The voltage is low enough for the proper
operation of an nMOS switch.

Figure 10.20. Dynamic CMFB boosting circuit.

Returning the output common mode level to Vdd/2 from the reset voltage 
is a task to be performed every clock cycle. Since it is deterministic in nature, 
realizing it does not need to be totally based on feedback. Thus, the output stage 
current sources are dynamically biased with the structure shown in Figure 10.20. 
It uses a switched capacitor to push the gate bias downward at the beginning of 
the on-phase. This introduces a current pulse to the output stage, facilitating the 
task of the common mode feedback circuit. The size of the switched capacitor 
and the current IB can be used to adjust the magnitude and the duration of the
current pulse. 



Table 10.1. Switchable opamps compared.

               Baschirotto    Peluso         Opamp1         Opamp2         Opamp3
Min supply     VT + 2 Vdsat   VT + 2 Vdsat   VT + 2 Vdsat   VT + 3 Vdsat   VT + 3 Vdsat
Max supply     not limited*   2 VT + Vdsat   2 VT + Vdsat   not limited    not limited

Ratings for GBW, phase margin, DC gain, thermal noise, slew rate, and CMFB
(marks in printed order for each column):

Baschirotto    +  +  +
Peluso         +  +  ++  +
Opamp1         +  +  +  +
Opamp2         +  +  ++
Opamp3         +  ++  +  ++

* requires changing the cascode biasing from that presented, adding one Vdsat to the
minimum supply.
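
The supply limits of Table 10.1 are easy to evaluate for a given process. The
sketch below is only an illustration: the threshold voltage of 0.5 V, the
saturation voltage of 0.15 V, and the helper names are assumptions, not values
or tools from the text. It lists which of the compared opamps can operate at a
given supply voltage.

```python
# Threshold and saturation voltages are illustrative assumptions.
VT = 0.5      # volts
VDSAT = 0.15  # volts


def supply_windows(vt, vdsat):
    """Map each opamp of Table 10.1 to (min_supply, max_supply) in volts.

    A max_supply of None stands for "not limited".
    """
    return {
        # "Not limited" here needs modified cascode biasing, which costs one
        # extra Vdsat of minimum supply (see the table footnote).
        "Baschirotto": (vt + 2 * vdsat, None),
        "Peluso": (vt + 2 * vdsat, 2 * vt + vdsat),
        "Opamp1": (vt + 2 * vdsat, 2 * vt + vdsat),
        "Opamp2": (vt + 3 * vdsat, None),
        "Opamp3": (vt + 3 * vdsat, None),
    }


def usable_opamps(vdd, vt=VT, vdsat=VDSAT):
    """Return the names of the opamps whose supply window contains vdd."""
    names = []
    for name, (vmin, vmax) in supply_windows(vt, vdsat).items():
        if vdd >= vmin and (vmax is None or vdd <= vmax):
            names.append(name)
    return names


for vdd in (0.9, 1.0, 1.5):
    print(f"Vdd = {vdd:.2f} V: {', '.join(usable_opamps(vdd))}")
```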



6.3 Switchable Opamps: Comparison

Table 10.1 shows a comparison of the presented switchable opamp topologies.
It should be noted that in most cases the GBW can be traded with the phase
margin by using the complementary topology (all nMOS devices changed to
pMOS and vice versa). What can be seen is the fact that all the topologies have
their strengths and weaknesses, making the choice of the opamp architecture
application specific.

7. Input Interfaces for SO Circuits

Switching the opamp is a way to get rid of the series switches connected
to the opamp output. Typically, however, there are also series switches in the
circuit input, which are not eliminated by the technique.

In Steyaert’s and Crols’ SO circuit [34], which was a low pass biquad, two
approaches were tested. The first one was a switch controlled with a voltage
higher than the supply, which is, however, not a true low-voltage technique. The
same, using a long-term-reliable bootstrapped switch, has later been proposed
in [134]. The second solution was to implement the first resistor, normally
realized with a switched capacitor, as a real resistor. This technique can be
used in some SO filters, but it is not applicable to all SO circuits.

Reduced swing input signal in conjunction with a series switch was used in
the ΔΣ modulators presented in [192] and [193]. This is not a true low-voltage
technique either, since when the supply voltage is reduced to a level where all
switches are still operational the available input signal swing is zero.

Bandpass circuits often have a non-switched capacitor in their input, and thus
they can be implemented in the SO technique without problems [191, 200].


7.1 Active Input Structures 

In Nyquist rate A/D converters DC decoupling is out of the question and 
reduced signal range is unacceptable because of thermal noise and increased 
Figure 10.21. Input structure based on feedback amplifier.



accuracy requirements for the comparators. At the time when the author’s first SO
pipeline ADC [8] was designed, no suitable solutions for the input circuitry 
existed. 

The structure employed, shown in Figure 10.21, is based on a continuous 
time feedback amplifier. It makes possible the moving of the series switch from 
the circuit input to the virtual ground created by the opamp, making low-voltage
operation possible. Recently, similar techniques for bringing the opamp input
near supply rails have been investigated in conjunction with continuous time
low-voltage circuits [201, 202].

The opamp is switchable, looking like just another switchable opamp to
the following circuitry. In the on-state the DC voltage level at the amplifier
output is Vdd/2 and the node n0 is near the ground level. Biasing the virtual
ground to a level different from the output level is accomplished with a
level-shifter (voltage source VOS) and the extra resistor R3. Instead of the
resistor, a current source can be used, which avoids the degrading of the opamp
gain but does not permit the biasing of n0 as low as the resistor and makes the
biasing less robust. Furthermore, the current source adds more parasitic
capacitance than the resistor.

The lower the voltage at n0, the better the switches conduct. In contrast, low
n0 requires small R3, which degrades the effective gain of the opamp, as seen
from the transfer function, which is given by

\[ \frac{V_{OUT}}{V_{IN}} = \frac{R_2}{R_1} \cdot \frac{1}{1 + \frac{1}{A}\left(1 + \frac{R_2}{R_1} + \frac{R_2}{R_3}\right)} \tag{10.2} \]
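
The trade-off between a low bias level at the virtual ground and the effective
gain can be made concrete with a few numbers. The sketch below evaluates only
the gain-reducing factor 1/(1 + (1/A)(1 + R2/R1 + R2/R3)) of the transfer
function, with A taken to be the opamp gain; the resistor values, the gain of
100, and the function name are assumptions made for illustration.

```python
def gain_factor(r1, r2, r3, a):
    """Factor by which the ideal closed-loop gain R2/R1 is reduced.

    r3 may be float('inf') to represent the case without the extra resistor.
    """
    return 1.0 / (1.0 + (1.0 + r2 / r1 + r2 / r3) / a)


# Illustrative values (assumptions): unity-gain stage, low-gain inverter.
R1 = 10e3
R2 = 10e3
A = 100.0

factor_no_r3 = gain_factor(R1, R2, float("inf"), A)
factor_r3_10k = gain_factor(R1, R2, 10e3, A)
factor_r3_2k = gain_factor(R1, R2, 2e3, A)

for label, value in (("no R3", factor_no_r3),
                     ("R3 = 10 kohm", factor_r3_10k),
                     ("R3 = 2 kohm", factor_r3_2k)):
    print(f"{label:14s} gain factor = {value:.4f}")
```

A smaller R3 pulls the virtual ground lower but increases the R2/R3 term, so
the gain error grows, which is the degradation described above.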



The circuit is intended to allow signal frequencies in the megahertz range. 
Thus, the bandwidth of the opamp is more important than the DC gain. Con- 
sequently, the opamp is a simple inverter, shown in Figure 10.22. It operates 
in class AB, which is achieved via dynamic biasing, realized with switched 
capacitors, which also form the offset voltage source (VOS in Figure 10.21).




Figure 10.22. Implementation of the amplifier and the voltage source of Figure 10.21.







Figure 10.23. Another active input structure [203]. 



The capacitors C1 and C2 are refreshed during the off-phase by connecting their
right terminals to the gate bias voltages of the transistors. The bias voltage on
resistor R3, which is also the second bias voltage for the capacitors, is produced
with a switchable current source IB.

The main drawback of this input structure is its limited linearity, which is 
due to inadequate opamp gain at the signal frequency and the signal dependent 
on-resistances of the series switches, which are not negligible compared to the 
resistor values. 

While the ADC prototype was still in the process of fabrication, a similar 
type of input structure was proposed by Baschirotto et al. in [203]. The circuit 
is shown in Figure 10.23. It is targeted at lower signal frequencies, allowing
larger resistor values, which makes it possible to leave out the series switches
and simply ground node n0 in the off-phase without a fear of input signal
feedthrough. As a result, n0 can be biased to Vdd/2 and no extra resistor or
current source is needed. The opamp utilized is a two-stage structure providing 
good low frequency linearity. This circuit also suffers from limited amplifier 
bandwidth, which degrades the high frequency linearity. 
[Figure panels: input interface, first SO stage, clock phases]

Figure 10.24. Passive input interface.



Input structures utilizing a transimpedance amplifier instead of an opamp 
have recently been studied in [204]. These circuits have potential for somewhat
larger bandwidths than opamp-based circuits. 

7.2 Passive Input Interface 

The input structures presented are based on the utilization of an active circuit 
block, typically an opamp, whose finite bandwidth limits their linearity at high 
signal frequencies. Hence, in high frequency applications a circuit without 
an opamp would be attractive. Such a circuit was developed for the second 
ADC prototype (at the end of Chapter 12) [3]. It is based on the idea that a
DC decoupled signal can be brought into SO circuits without problems, and 
since the opamp input is purely capacitive, DC decoupling only leads to the 
loss of signal DC value. Even this can be avoided, if the DC voltage on the 
coupling capacitor is known. Controlling the voltage on the capacitor can be 
accomplished by resetting it every clock cycle. 

A circuit realizing this idea, together with the required clock signals, is shown 
in Figure 10.24. The input interface comprises resistor R, coupling capacitor 
C, and switch transistors M1-M3. The first SO stage is partially shown on the 
right of the input structure. Figure 10.25 shows simulated waveforms obtained 
with a 1.7-MHz signal at a 5-MHz clock rate. 

The circuit uses three clock phases, the hold phase lasting a half clock period,
while the reset and the sample phase are each a quarter period long. During the
reset phase (φ = 0, r1 = 1, r2 = 1) the voltage on capacitor C is reset with shunt
switch M3. To allow the operation of the n-type reset switch, node n1 is shorted
to the ground and node n2 is left floating.

In the sampling phase (φ = 0, r1 = 0, r2 = 0) the input voltage is sampled in the
series combination of the coupling capacitor C and the sampling capacitor CS.
Making C large compared to CS results in the majority of the signal appearing
across CS. The remaining attenuation can be compensated for by properly
adjusting the ratio of CS and CF in the first SO stage by setting

\[ C_S = C'_S \, \frac{C + C_2}{C - C'_S}, \tag{10.3} \]

where C'S is the nominal value of CS and C2 the parasitic capacitance at node
n2. The remaining gain error resulting from the uncertainty of C2 is small
enough for most applications.

Figure 10.25. Simulated voltages in the circuit shown in Figure 10.24.
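
The compensation of the sampling capacitor can be checked with a short
calculation. The sketch below assumes that the input divides between C and the
parallel combination of CS and C2, so that the charge stored in CS per volt of
input is CS · C/(C + CS + C2); the capacitor values and function names are
illustrative assumptions. It sizes CS as C'S (C + C2)/(C - C'S) and confirms
that the stored charge then equals the nominal value.

```python
def compensated_cs(cs_nominal, c_couple, c_par):
    """Sampling capacitor that restores the nominal charge per volt."""
    if c_couple <= cs_nominal:
        raise ValueError("coupling capacitor must exceed the nominal CS")
    return cs_nominal * (c_couple + c_par) / (c_couple - cs_nominal)


def charge_per_volt(cs, c_couple, c_par):
    """Charge stored in CS per volt of input (assumed capacitive divider)."""
    return cs * c_couple / (c_couple + cs + c_par)


# Illustrative values in farads (assumptions, not taken from the text).
CS_NOMINAL = 1e-12
C_COUPLE = 20e-12
C_PAR = 0.2e-12

cs = compensated_cs(CS_NOMINAL, C_COUPLE, C_PAR)
q_uncompensated = charge_per_volt(CS_NOMINAL, C_COUPLE, C_PAR)
q_compensated = charge_per_volt(cs, C_COUPLE, C_PAR)

print(f"compensated CS:        {cs * 1e12:.4f} pF")
print(f"charge, nominal CS:    {q_uncompensated * 1e12:.4f} pC/V")
print(f"charge, compensated:   {q_compensated * 1e12:.4f} pC/V")
```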

In the hold phase (φ = 1), which is also the on-phase of the switchable
opamp, both nodes n1 and n2 are shorted to the ground and the reset switch
is open so as to improve isolation against input signal feedthrough. To the
first switched opamp stage the input circuitry looks just like another switchable
opamp in the off-state. In this phase the sampled charge is
transferred to the feedback capacitor CF.

The size of resistor R is set by two constraints. First, it should be large
in order to minimize the fractional signal voltage seen at n1, which results
from resistive division between R and M1’s on-resistance and causes signal
feedthrough in the hold phase. For the same reason, M1 and M2 should be
wide devices. Their size, however, cannot be made arbitrarily large, since their
nonlinear parasitic capacitances are a source of harmonic distortion. On the 
high side the size of R is limited by the sampling time constant. The size of 
C is limited by the available area and its bottom plate parasitic capacitance, as 
well as the time constant associated with the resetting. 

There is one potential problem in the circuit, which can, however, be avoided
by one additional switch. When the capacitor is being reset, nodes n1 and n2
are shorted and as a result an attenuated version of the input signal is seen at
node n2: V2R = VIN · Ron1/(Ron1 + R). At the end of the reset phase this
voltage is sampled in CS, introducing an error to the signal voltage, which will
be sampled in the next phase. The error is equal to

\[ V_{2S} = V_{2R} \cdot \frac{C_S + C_2}{C + C_S + C_2}. \tag{10.4} \]



The effect on the frequency response can be modeled with a two-tap FIR
structure. If the effect is intolerable, which is unlikely, it can easily be
almost totally eliminated by adding a series switch on the opamp side of CS,
which enables it to be disconnected during the reset phase.
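
To get a feeling for the size of this error, it can be evaluated for some
representative numbers. The sketch below is an assumption-laden illustration:
the component values are invented, the error tap is assumed to add to the
signal, and the tap delay is taken as a quarter of the clock period (the
length of the sampling phase). Only the two expressions for the reset-phase
voltage and the error of (10.4) come from the text.

```python
import cmath
import math


def reset_error_fraction(r, ron1, c, cs, c2):
    """Error voltage V2S per volt of input, from V2R and (10.4)."""
    v2r_per_volt = ron1 / (ron1 + r)
    return v2r_per_volt * (cs + c2) / (c + cs + c2)


def two_tap_response(eps, freq, delay):
    """Response of the assumed model y(t) = x(t) + eps * x(t - delay)."""
    return 1.0 + eps * cmath.exp(-2j * math.pi * freq * delay)


# Illustrative component values (assumptions).
R = 500.0        # ohms
RON1 = 25.0      # ohms
C = 20e-12       # farads
CS = 1e-12       # farads
C2 = 0.2e-12     # farads
F_CLK = 5e6      # hertz, clock rate used in the simulated waveforms
DELAY = 0.25 / F_CLK  # assumed tap spacing: one quarter clock period

eps = reset_error_fraction(R, RON1, C, CS, C2)
h_dc = two_tap_response(eps, 0.0, DELAY)
h_sig = two_tap_response(eps, 1.7e6, DELAY)

print(f"error tap weight:   {eps:.6f}")
print(f"|H| at DC:          {abs(h_dc):.6f}")
print(f"|H| at 1.7 MHz:     {abs(h_sig):.6f}")
```

With these assumed values the second tap is below 0.3% of the main tap, which
is consistent with the remark that the effect is unlikely to be intolerable.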

The input structure does not significantly increase thermal noise, since the
noise sampled in the sampling phase is the normal kT/CS and the additional
noise sampled in the reset phase is determined by the total capacitance, including
the large capacitor C.
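
A rough numerical comparison of the two noise contributions is sketched below.
The capacitor values and the temperature are assumptions, and the two
contributions are simply added as uncorrelated voltages without modeling any
further attenuation of the reset-phase noise, so the result is an
order-of-magnitude illustration rather than a noise analysis of the circuit.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_noise_rms(capacitance, temperature=300.0):
    """RMS voltage of kT/C noise sampled on the given capacitance."""
    return math.sqrt(K_BOLTZMANN * temperature / capacitance)


# Illustrative values in farads (assumptions, not taken from the text).
C = 20e-12
CS = 1e-12
C2 = 0.2e-12

noise_sample = ktc_noise_rms(CS)            # sampling phase, kT/CS
noise_reset = ktc_noise_rms(C + CS + C2)    # reset phase, total capacitance
noise_total = math.hypot(noise_sample, noise_reset)

print(f"sampling phase: {noise_sample * 1e6:.2f} uV rms")
print(f"reset phase:    {noise_reset * 1e6:.2f} uV rms")
print(f"combined:       {noise_total * 1e6:.2f} uV rms")
```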

The clock signals with 25% and 75% duty cycles are most easily realized 
by using a double rate clock signal, from which all the necessary clocks are 
generated. 
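
One way to derive the three phases from a double rate clock is sketched below
as a behavioral model. It is an assumption about the logic, not the circuit
used in the prototype: a divide-by-two toggles on the rising edges of the
double rate clock, and the signal names reset, sample, and hold are
hypothetical. The reset and sample pulses have 25% duty cycles, their
complements have 75%, and hold has 50%.

```python
def generate_phases(num_periods):
    """Return one (clk2x, reset, sample, hold) tuple per quarter period."""
    div2 = 0
    prev_clk = 0
    rows = []
    for slot in range(4 * num_periods):
        clk2x = 1 - (slot % 2)            # double rate clock: 1, 0, 1, 0, ...
        if clk2x == 1 and prev_clk == 0:
            div2 ^= 1                     # toggle on each rising edge
        prev_clk = clk2x
        reset = div2 & clk2x              # first quarter of the period
        sample = div2 & (1 - clk2x)       # second quarter
        hold = 1 - div2                   # second half of the period
        rows.append((clk2x, reset, sample, hold))
    return rows


phases = generate_phases(2)
for clk2x, reset, sample, hold in phases:
    print(f"clk2x={clk2x} reset={reset} sample={sample} hold={hold}")
```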
OTHER LOW-VOLTAGE TECHNIQUES



1. Low Voltage SC Technique with Unity-Gain-Reset Opamps

The main factor which makes switched opamp circuits slower than traditional 
SC circuits is the time it takes to wake up the opamp from the off-state. Opamp
switching is needed to make possible the shorting of the output node to ground, 
which is in turn needed by the charge transfer occurring in the following stage. 
The charge transfer, however, does not require the output reset level to be either 
of the supply levels; it can be any constant voltage. 

The idea proposed in [205] uses this by connecting the opamp into unity gain 
feedback instead of turning it off and shorting the output to ground. Conse- 
quently, the opamp output settles to the virtual ground level in the reset phase. 
The principle of this technique is illustrated in Figure 11.1, where two cascaded 
integrators are shown. 




Figure 11.1. SC circuit based on unity-gain-reset opamps.

Figure 11.2. Floating voltage source in the feedback loop prevents the charge leakage from
integration capacitor.



Originally it was proposed to set the reference level Vref equal to Vss,
which permits a supply voltage as low as in the switched opamp technique.
This, however, introduces two problems. First, the switch on the left side of
the integration capacitor C3 is an nMOS transistor, whose junction diode easily
becomes forward biased when the signal voltage on the capacitor pushes the
node between the switch and the capacitor down in the reset phase. Second,
driving the opamp output all the way down to Vss in the reset phase pushes its
output stage transistors out of saturation, resulting in a recovery time which is
not much better than in a well-designed switchable opamp.

In [206] a fully differential circuit was designed using Vref of 500 mV, 
which does not permit as low a supply voltage, but removes both the problems 
mentioned. Furthermore, letting the output CM level be the same 500 mV also 
in the integration phase removes the need for extra DC correcting capacitors.

The original paper [205] proposes adding a floating voltage source in the unity
gain feedback loop so as to prevent the charge leakage without increasing the
supply voltage. The resulting circuit is shown in Figure 11.2. Now the output
is reset to Vdd instead of Vss, which pulls the node behind the integration
capacitors up when entering the reset phase, and thus no leakage can occur.
The opamp output is still driven out of saturation, which can be avoided by
making the voltage source somewhat smaller than Vdd. The voltage source
can easily be realized with a switched capacitor, as shown in the paper. The
same authors have demonstrated the feasibility of the technique via the design
of a ΔΣ modulator reported in [207].

Different CM levels in the reset and integration phase still require an extra 
switched capacitor, just like SO circuits, for adjusting the CM level. Therefore, I 
propose here that the voltage source should be Vdd/2, which removes the need
for the capacitor while offering maximal voltage swing with minimal supply. 
Figure 11.3. Low voltage current mirror.



The implementation of a voltage source is, however, not easily done with a 
switched capacitor, but can be realized with a resistive level shifter [208]. 

The non-switched opamp makes the technique potentially faster than the SO 
one. However, the capacitors are still permanently connected to the opamp 
outputs and the opamp is not available for signal processing during one half 
of the clock period. Thus, the technique still features some speed penalties in 
comparison to SC circuits. Further, the settling of a single stage involves two 
opamps, making it potentially longer. The resetting also requires the opamp to 
be unity gain stable with a good phase margin, which is not generally required 
in SO and SC circuits. 

2. Current Sources and Mirrors 

The traditional current mirror, either a simple topology or a cascoded
structure, sinks the current through a diode-connected transistor. As a result the
input voltage is VT + Vdsat, which does not leave much room for other circuit
structures in low voltage realizations. The low voltage current mirror, shown in
Figure 11.3, removes this limitation by using the cascode node as the input for
the signal current; only the DC bias current (I0) is supplied in the traditional
way. Thus, the minimum input voltage is now only Vdsat.

Since the signal current does not flow through the cascode transistor M3, the 
voltage variation at node n2 is very small (Vn1 divided by the gain of M3), i.e.
the circuit has a very low input impedance. This property can be exploited to 
realize a linear voltage-to-current converter by placing a series resistor in the 
input of the current mirror [209]. 

An even lower input, as well as output, voltage can be realized by biasing the
transistors M1 and M2 in the triode region instead of saturation. Then, however,
the output impedance becomes so low that the circuit no longer acts as a current
source (or mirror). This can be corrected by adding a feedback loop, which
makes the voltage at node n2 track the output voltage. Two realizations are
shown in Figure 11.4. The one on the left [210] controls the gate voltage of the
cascode transistor M3 with a level shifter constructed with a diode-connected
transistor M4, which is matched with M3. The circuit on the right [211] uses
an opamp to form the feedback loop. Another similar type of circuit is reported
in [212].

Figure 11.4. Triode region current source: two implementations.

Figure 11.5. Low voltage bandgap reference.

In the literature [211, 212] the triode region current source has been employed
as a tail current source in opamps. In principle it can be utilized in the output 
stage as well, but the local feedback loop has an effect on the opamp frequency 
response, requiring at least careful settling analysis and simulations. 

3. Bandgap References 

Like many other analog circuits, ADCs need a reference voltage, which is 
used for determining the quantization levels. In experimental prototype cir- 
cuits the reference voltage is often supplied externally, while many commercial 
devices have an on-chip voltage reference, which is more convenient for the 
end user. The on-chip reference is typically realized with a bandgap reference 
(BGR) circuit, which provides a stable reference voltage over a wide tempera- 
ture range. 

3.1 Low Voltage BGR Circuits 

The operation of a bandgap reference is based on the fact that the base-emitter
voltage (Vbe) of a bipolar transistor has a negative temperature coefficient,
while the voltage difference of two base-emitter junctions (ΔVbe), biased
with different current densities, has a positive one. Thus,
a properly weighted sum of them (Vbe + K · ΔVbe) is free of temperature
dependency. In traditional BGR circuits the sum is formed in the voltage domain,
resulting in a reference voltage around 1.25 V, which is clearly an obstacle to
low voltage operation.
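
The weighting can be illustrated numerically. The following is only a first-order sketch: the Vbe value at 300 K, its temperature coefficient of -2 mV/K, and the current-density ratio of 8 are assumed illustrative numbers, not values taken from the text, and ΔVbe is modeled as (kT/q) · ln(n).

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q_E = 1.602176634e-19   # elementary charge [C]

# Assumed illustrative values (not from the text).
VBE_300K = 0.65         # Vbe at 300 K [V]
VBE_TC = -2.0e-3        # dVbe/dT [V/K], treated as constant
RATIO = 8               # current-density ratio of the two junctions


def delta_vbe(temp_k, density_ratio):
    """PTAT difference of two base-emitter voltages: (kT/q) * ln(n)."""
    return K_B * temp_k / Q_E * math.log(density_ratio)


def compensating_gain(vbe_tc, density_ratio):
    """Gain K that cancels the first-order temperature slope of Vbe."""
    ptat_tc = K_B / Q_E * math.log(density_ratio)   # d(dVbe)/dT [V/K]
    return -vbe_tc / ptat_tc


K = compensating_gain(VBE_TC, RATIO)


def vref(temp_k):
    """Weighted sum Vbe + K * dVbe in the linearized model."""
    vbe = VBE_300K + VBE_TC * (temp_k - 300.0)
    return vbe + K * delta_vbe(temp_k, RATIO)


for t in (250.0, 300.0, 350.0):
    print(f"T = {t:5.1f} K   Vref = {vref(t):.4f} V")
```

With these assumed numbers the sum stays at 1.25 V over temperature, which is consistent with the reference level quoted above. A real Vbe also has curvature that this linear model ignores.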



A lower supply can be used if currents are summed instead of voltages. Then 
the required minimum supply voltage is the voltage of a forward-biased base- 
emitter junction plus a headroom for a current source. Thus, the circuit can be 
realized with a supply voltage around 0.9 V. Such circuits have been reported 
in [213, 214, 215, 216].

The low voltage BGR proposed in [214] is shown in Figure 11.5. The feedback
loop, consisting of an opamp and a pair of matched controlled current
sources, forces the voltages v1 and v2 to be equal. Consequently, the current
through the resistor R1 is proportional to Veb and the current through the resis- 
tor R3 to the difference of the emitter-base voltages of the two pnp transistors. 
Setting the resistor R2 equal to R1 makes their currents the same. Since the 
current of the controlled source is the sum of currents through R2 and R3, it 
will be proportional to Veb + K · ΔVeb, which is exactly what is required
from a temperature-independent reference. The current generated is mirrored 
through the resistor R4, producing the reference voltage across it. 
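
The current summation described above can be written out as follows. This is a sketch that assumes a unity mirror ratio into R4 and ideal matching; the symbol I_ref for the controlled source current is introduced here for illustration.

```latex
I_{\mathrm{ref}} = \frac{V_{EB}}{R_2} + \frac{\Delta V_{EB}}{R_3},
\qquad
V_{\mathrm{ref}} = R_4 \, I_{\mathrm{ref}}
               = \frac{R_4}{R_2} \left( V_{EB} + \frac{R_2}{R_3} \, \Delta V_{EB} \right)
```

Under these assumptions the weighting factor is K = R2/R3, and the ratio R4/R2 scales the reference freely below the traditional 1.25 V level.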

The biggest problem in this circuit is the realization of the opamp input
stage. Over temperature, the emitter-base voltage (v1 in the circuit) goes to
about 500 mV at the lowest and above 800 mV at the highest. Thus, without
increasing the supply voltage, there is not enough room for a MOS transistor
gate-source voltage between v1 and either of the supply voltages. In the future,
when the MOS threshold voltage is scaled down close to 300 mV, the problem
will disappear, which will also be the case when the process offers low-VT MOS
transistors. In a BiCMOS technology, npn transistors similar to those used for
generating the bandgap voltage can be used as the opamp input devices [216].

Two solutions for a standard CMOS technology have been proposed. One is 
to replace the opamp with a transimpedance amplifier and connect the resistors 
R1 and R2 to its inputs instead of the ground [215]. The second, proposed
by the author [13], is shown in Figure 11.6. There, the opamp uses a pMOS
input pair and its inputs are connected to intermediate taps of resistors R1 and
R2. The tap voltages v3 and v4, which are in a 1:3 proportion to the voltages
v1 and v2, are low enough for the pMOS input pair.

Figure 11.6. Proposed low-voltage BGR circuit with startup.

The current sources are cascoded to increase output current accuracy. This, 
unfortunately, eats the precious voltage headroom needed for low noise and low 
offset biasing. With low-VT transistors it would be possible to bias the output
cascode in such a way that the voltage at node n7 tracks v1, which would make
the cascodes in the other two current sources redundant. 

The BGR circuit has two stable operation points: the desired one and the 
other where the current is zero, i.e. voltages v1 and v2 are both zero. To ensure
that the circuit always ends up in the correct operation point a startup circuit is 
included. There, the resistor R<g is used to produce a current which is injected 
into node n1 if the voltage v1 goes below one threshold voltage of an nMOS
transistor. In the desired operation point the voltage v1 is above the threshold
and thus the startup circuit has no effect on the BGR circuit. Since the opamp 
is biased from the BGR, the startup circuit also ensures its bias current. 

According to simulations [13], the circuit can be used with a supply voltage 
ranging from 0.95 V to 1.50 V and at temperatures ranging from -20 to +100 °C.

3.2 Reference Voltage Driver 

The voltage provided by the bandgap reference is generated across a resis- 
tor, and thus it is not suitable for supplying a switched capacitor load without 
buffering. Typically, ADCs (pipelined or delta-sigma) based on the switched 
capacitor or switched opamp techniques utilize fully differential circuitry, which 
demands the reference voltage to be differential, i.e. a difference of two volt- 
ages set symmetrically between the supply rails. The circuitry in these ADCs 
operates in two phases, each lasting half of the clock cycle. Consequently, the 
capacitors have to be loaded to the reference voltages in half a clock period. 

The proposed driver circuit is shown on the left in Figure 11.7. There, the 
reference current Iref is supplied by the bandgap reference, which is the circuit
shown in Figure 11.6 without the resistor R4. The current is mirrored with
transistor M2 to go through a floating resistor R1, matched with the resistors
in the bandgap reference circuit. As a result, the differential reference voltage
appears across the resistor R1. The common mode voltage level is controlled
by adjusting the current of M1 with an opamp and feedback loop. The bias
voltage Vr is generated with a replica circuit. To improve the accuracy of the
current mirroring, the voltage Vo is adjusted in such a way that the voltage of
node n3 tracks node n2.

Figure 11.7. Differential reference voltage driver (left) and unity gain buffer A1 (right).

The generated reference voltages at nodes n1 and n2 are buffered with the
unity gain buffers A1 and A2. The schematic of the buffer A1 is shown on the
right in Figure 11.7. It is an amplifier, which consists of a differential pair and a
low voltage current mirror load, connected in unity gain feedback. The cascode
transistor is biased in such a manner that node n1 tracks the input voltage in
order to minimize the systematic offset resulting from the amplifier imbalance.
The buffer A2 is similar to A1, except that all nMOS transistors are replaced
with pMOS devices and vice versa.

PROTOTYPES AND EXPERIMENTAL RESULTS



1. Measurement Setups and Methods 

1.1 Measuring Dynamic Performance of S/H Circuits 

The target application of an S/H circuit has a large impact on measuring the 
circuit. If the circuit is designed to be used with an ADC integrated on the same 
chip, no capability of driving an external 50-Ω load is needed. This, however,
prevents straightforward full-speed measurements. On the other hand, in ADC 
applications, only the instantaneous value of the S/H circuit output at the end 
of the hold phase is of interest. Consequently, continuous time measurements 
may give results that are too pessimistic. 

The easiest way to get rid of these problems is to characterize the S/H circuit 
together with the ADC. Then, however, it may be difficult to distinguish between 
the properties of the ADC and the S/H circuit. A 50-Ω driving capability can
be obtained with an on-chip buffer, yet its implementation is often even more 
demanding than the design of the S/H circuit itself. So the designer can very 
easily end up in a situation where he or she is measuring the output buffer rather 
than the S/H circuit. 

A widely used way to characterize S/H circuits is the beat frequency test
[35]. There, two S/H circuits are integrated on the same chip and one is used
to measure the other. The measurement setup used to characterize the
implemented circuits is shown in Figure 12.1. In this, the output of the first S/H
circuit (the one on the left) is sub-sampled with the second circuit, whose clock
signal is obtained by dividing the clock of the first circuit with an on-chip
divider by some integer N. Now, if the input signal frequency is within a small
offset, say Δf, of the clock frequency of the second circuit (fs/N), the signal
is aliased to a low frequency, which is equal to the frequency offset Δf. In a
similar manner the possible harmonics in the first circuit output become aliased
to the frequencies 2Δf, 3Δf, 4Δf, and so on, as illustrated in the inset of
Figure 12.1.

Figure 12.1. Measurement setup for beat frequency test.
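
The aliasing arithmetic of the beat frequency test can be checked with a short calculation. The sketch below is an illustration only; the function names are hypothetical, and exact rational arithmetic is used so that fs/N does not pick up rounding errors. The example values (a 73.3-MHz input, a 220-MHz clock and N = 3) are the ones that appear later in this chapter.

```python
from fractions import Fraction


def aliased_frequency(f_in, f_sample):
    """Frequency (0..f_sample/2) at which a tone at f_in appears after sampling."""
    f = Fraction(f_in) % Fraction(f_sample)
    return min(f, Fraction(f_sample) - f)


def beat_frequencies(f_in, f_s, divider, harmonics=4):
    """Aliased positions of the fundamental and its harmonics at the output
    of the second S/H circuit, which is clocked at f_s / divider."""
    f_sub = Fraction(f_s, divider)
    return [aliased_frequency(k * f_in, f_sub) for k in range(1, harmonics + 1)]


# 73.3-MHz input, 220-MHz clock, second S/H clocked at fs/3.
for k, f in enumerate(beat_frequencies(73_300_000, 220_000_000, 3), start=1):
    print(f"harmonic {k}: {float(f) / 1e3:.1f} kHz")
```

The fundamental lands at the offset Δf (about 33.3 kHz here) and the harmonics at its multiples, as the text describes.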

The beat frequency test provides a way to investigate the distortion
characteristics of an S/H circuit at high signal frequencies without the need to bring
high frequency signals out from the chip. The second S/H circuit does not
even need to drive the 50-Ω input impedance of the measurement equipment or
balun, since the low frequency signal can be handled with an active
differential-to-single-ended converter constructed of discrete opamps.

Since the second S/H circuit samples the fully settled output of the first one, 
the measurement setup simulates the actual operation environment in front of an 
ADC on the same chip. Although the output of the second circuit is measured 
in continuous time, the error introduced is insignificant as a result of the fact 
that the difference between the concurrent sample values is very small. Also, 
the sinc attenuation resulting from the hold operation can be ignored at low
frequencies. 

1.2 ADC Measurements 

The methods for measuring ADC performance and the related figures of 
merit are outlined in two IEEE standards. The more recent of them,
IEEE-STD-1241 [217], is specifically targeted at ADCs, and thus it virtually replaces
the earlier one [218] for waveform recorders.

1.2.1 Static Linearity 

The quantization levels of an ADC can be measured using a servo loop. 
The linearity errors, DNL and INL, are calculated from the measured levels. 
Alternatively, a histogram-based method can be used to reduce the complexity
of the measurement setup. In this, a signal with a known waveform is applied
to the ADC input and the resultant output codes are collected into bins, each of 
which corresponds to one possible ADC output code. The number of samples 
that fall into a code bin represents the bin width, while the ideal bin widths can 
be derived with the knowledge of the signal waveform. The difference between 
these two is used to calculate the DNL and INL. This so-called code density test 
can be performed at full speed and with a sufficiently high signal frequency. 

The histogram of a triangular or sawtooth wave is flat, making the calculations 
easy. Generating either type of signal with adequate purity, however, is difficult. 
Thus, a sinusoidal signal is often preferred, since it can readily be generated 
with a general signal source and proper filtering. Furthermore, the signal’s 
purity can be checked with a spectrum analyzer. 

Properly selecting the signal frequency with respect to the clock frequency is 
important in order to guarantee that the code density is not affected by unwanted 
correlation between the signal and the clock frequency. The required length of 
the data record is set by the noise and desired tolerance and confidence levels. 
A more detailed description of the code density test can be found in the
standard [217] and an earlier publication [219]. The development history of the
method can be followed with references [220, 221, 222, 223].
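
As an illustration of the code density test with a sinusoidal input, the following sketch estimates DNL and INL from a record of output codes. It is a simplified version of the sine histogram procedure, not the exact algorithm of the standard: it assumes a sine that slightly overdrives the ADC, no input offset, and it removes gain error by normalizing to the average code width. The function names and the 64-code ideal quantizer used in the demonstration are hypothetical.

```python
import math


def sine_histogram_linearity(codes, n_codes):
    """Estimate DNL and INL (in LSB) of the inner codes from sine-wave data."""
    hist = [0] * n_codes
    for c in codes:
        hist[c] += 1
    total = len(codes)

    # The cumulative histogram gives the probability that the sine is below
    # each code transition; inverting the sine's distribution function gives
    # the transition level for a unit-amplitude sine.
    transitions = []
    cum = 0
    for k in range(n_codes - 1):
        cum += hist[k]
        transitions.append(-math.cos(math.pi * cum / total))

    # Widths of codes 1 .. n_codes-2 (the end codes have no defined width).
    widths = [b - a for a, b in zip(transitions, transitions[1:])]
    lsb = sum(widths) / len(widths)
    dnl = [w / lsb - 1.0 for w in widths]

    inl, acc = [], 0.0
    for d in dnl:
        acc += d
        inl.append(acc)
    return dnl, inl


def ideal_adc(v, n_codes):
    """Ideal quantizer for [-1, 1); out-of-range inputs clip to the end codes."""
    return min(n_codes - 1, max(0, math.floor((v + 1.0) / 2.0 * n_codes)))


# Demonstration with an ideal 64-code quantizer and a 5 % overdriven sine.
N_CODES = 64
N_SAMPLES = 100_000
demo_codes = [
    ideal_adc(1.05 * math.sin(2.0 * math.pi * (i + 0.5) / N_SAMPLES), N_CODES)
    for i in range(N_SAMPLES)
]
demo_dnl, demo_inl = sine_histogram_linearity(demo_codes, N_CODES)
print(f"max |DNL| = {max(abs(d) for d in demo_dnl):.4f} LSB")
```

For the ideal quantizer the estimated DNL stays close to zero; the small residual comes from the finite record length, which is the reason the required record length depends on the desired tolerance and confidence level.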

1.2.2 Signal to Noise and Distortion Ratio 

The standard provides two methods for determining the SNDR, a frequency 
domain method based on DFT (discrete Fourier transform) and a time domain 
method using curve fitting. In the latter, the recorded sine wave is fitted to an 
ideal sine wave by minimizing the mean square error. The difference between 
the curves includes the ideal quantization error as well as the effect of static 
and dynamic ADC errors and noise. Thus, the SNDR can be calculated. This 
is the method used in measuring the prototypes described later in this chapter. 
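
A minimal sketch of the sine fit approach is given below. It implements the three-parameter fit (amplitude, phase and offset, with the signal frequency assumed to be known), solved through the normal equations of the least-squares problem. The function names are hypothetical and the code is meant as an illustration rather than as the measurement software used for the prototypes.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def sine_fit_sndr(samples, cycles_per_sample):
    """Fit a*cos(wn) + b*sin(wn) + c to the record; return (amplitude, offset, SNDR dB).

    The residual must be non-zero, which is always the case for quantized data.
    """
    w = 2.0 * math.pi * cycles_per_sample
    basis = [(math.cos(w * n), math.sin(w * n), 1.0) for n in range(len(samples))]
    ata = [[sum(r[i] * r[j] for r in basis) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * y for r, y in zip(basis, samples)) for i in range(3)]
    a, b, c = solve_linear(ata, atb)

    residual = [y - (a * r[0] + b * r[1] + c) for r, y in zip(basis, samples)]
    err_power = sum(e * e for e in residual) / len(samples)
    sig_power = (a * a + b * b) / 2.0
    return math.hypot(a, b), c, 10.0 * math.log10(sig_power / err_power)


# Demonstration: an ideally quantized 10-bit sine (assumed test data).
LSB = 2.0 / 1024
FREQ = 0.1234567   # signal frequency in cycles per sample
record = [round(0.999 * math.sin(2.0 * math.pi * FREQ * n + 0.3) / LSB) * LSB
          for n in range(4096)]
amplitude, offset, sndr = sine_fit_sndr(record, FREQ)
print(f"amplitude = {amplitude:.4f}, offset = {offset:.5f}, SNDR = {sndr:.1f} dB")
```

For an ideal 10-bit quantizer the result should come out near the theoretical 6.02 · 10 + 1.76 ≈ 62 dB, since the residual then contains only the quantization error.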

The alternative method extracts the error energy from the spectrum obtained 
with the DFT. The signal energy is in one frequency bin, while the error energy is 
distributed to the others, with the exception of the zero frequency bin, which also 
contains the DC term. The method requires the signal energy to be contained 
exactly in one bin, which is obtained when the record has an integer number of 
signal cycles. The sine fit method does not have this restriction. 
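
The frequency domain calculation can be sketched in the same way. The example below assumes a coherently sampled record, i.e. an integer number of signal cycles in an even-length record, so that the signal occupies exactly one bin. A plain O(N²) DFT is used to keep the sketch free of external libraries; the function names are hypothetical.

```python
import cmath
import math


def dft_power_spectrum(samples):
    """Power |X[k]|^2 for bins k = 0 .. N/2 of a real record (plain DFT)."""
    n = len(samples)
    twiddle = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * twiddle[(k * i) % n] for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def sndr_from_dft(samples, signal_bin):
    """SNDR in dB for a coherently sampled, even-length record."""
    power = dft_power_spectrum(samples)
    nyquist = len(samples) // 2
    # Bin 0 holds the DC term and is excluded. The Nyquist bin has no mirror
    # image in the two-sided spectrum, hence the weight of one half.
    error = sum(p for k, p in enumerate(power[:nyquist])
                if k not in (0, signal_bin))
    error += 0.5 * power[nyquist]
    return 10.0 * math.log10(power[signal_bin] / error)


# Demonstration: ideally quantized 10-bit sine, 37 cycles in 1024 samples.
LSB_DFT = 2.0 / 1024
N_DFT, CYCLES = 1024, 37
record_dft = [
    round(0.999 * math.sin(2.0 * math.pi * CYCLES * n / N_DFT + 0.3) / LSB_DFT) * LSB_DFT
    for n in range(N_DFT)
]
print(f"SNDR = {sndr_from_dft(record_dft, CYCLES):.1f} dB")
```

Choosing the number of cycles relatively prime to the record length, as with 37 and 1024 here, makes every sample fall on a different phase of the signal.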

The SFDR and THD can be determined from the DFT spectrum. 

2. S/H Circuit Using Double-Sampling 

The goal of this design is to develop a high-speed CMOS S/H circuit for 
time-interleaved ADCs. The target specifications were set as follows: 10-bit 
resolution, a sampling rate higher than 100 MS/s, 2-Vpp differential signal 
swing from a 3.0-volt supply, and reasonable power consumption. This prototype
has been published in [12, 10, 11].













Figure 12.2. Fully differential double-sampled S/H circuit. 



2.1 Architecture 

The architecture of the prototype is shown in Figure 12.2. It is a fully dif- 
ferential version of the double-sampled S/H circuit shown in Figure 9.3. The 
differential structure is almost a necessity in the mixed signal environment of 
ADCs, where the amount of substrate noise and other disturbances is
considerable.

The signal common mode level at the input and the output of the circuit need
not be the same; neither do the common mode levels at the input and the output of
the opamp. This can be utilized to adjust the level of the continuous time input
signal to the region where the distortion caused by the signal-dependent switch
on-resistance is minimized. In the case of nMOS switches, the input signal
level should be as low as possible. There are, however, two reasons which
set a lower limit for the input signal common mode level. First, the negative
peak voltage may not go much below VSS, in order to prevent the pn-junctions
in the drain and the source of the MOS switch from becoming forward-biased.
Second, if the S/H circuit is driven without DC decoupling, the driver
circuit probably cannot provide a signal swing that ranges down to VSS. The
signal levels used in this design are shown in Figure 12.3.

The offset between the opamp input and output common mode level in hold 
mode is the same as the difference between the input signal CM voltage and 
the sampling ground. The maximum tolerable offset is heavily dependent on 
the type of opamp. From the sampling switch point of view, it is preferable to 
make the sampling ground voltage as low as possible in order to reduce switch 
size. For maximum signal swing and minimum distortion the output common 
mode level is set half-way between the supply voltages. The circuit employs the 
bottom plate sampling technique to avoid a signal-dependent charge injection 
from the MOS switches. 

2.2 Switches 

The switches throughout the design are implemented with nMOS transistors. 
The distortion resulting from the signal-dependent switch time constant is sup- 
pressed below the target level by controlling the switches with a voltage higher 
than the 3-volt supply. This voltage is not generated on the chip; instead, an 
external voltage source is used. 

Although bottom plate sampling minimizes the signal-dependent charge in- 
jection, the constant common mode voltage step resulting from the injected 
charge and clock feedthrough can still be a problem, since it may cause the
common mode level in the opamp input to exceed the valid range. By making
the switches S1 and S8 equal in size, the problem can be minimized, thanks to
the fact that the switches operate in opposite clock phases, and thus their charge
injections cancel each other.

Figure 12.4. CMFB circuit for double-sampling circuit.



2.3 Clock Generator 

Avoiding systematic timing skew in the double-sampled S/H circuit is es- 
sential. Thus, so as to guarantee an exact 180° phase difference in the half-rate 
clocks, the input for the clock generator is derived from the incoming full-rate 
clock using a synchronous divide-by-two circuit, which is built with a
differential D-flipflop [224]. The clock generator relies on a standard structure, based
on cross-coupled OR-gates, in producing the non-overlapping clock phases. 
The last clock buffer stages, which provide the switch control voltage, use the 
high supply voltage. 

2.4 Opamp 

The opamp architecture, which has already been studied in Chapter 7, is based
on a cascode output stage and low-gain first stage. Since it is fully differential,
it requires a common mode feedback circuit. Due to the double-sampling, the
common mode feedback has to be active in both clock phases. Such a
feature is easily implemented with a continuous-time CMFB circuit. In this 
design, however, two parallel switched capacitor CMFB circuits are operated 
in opposite clock phases. The circuit is shown in Figure 12.4. 

The simulated opamp frequency response shows a 450-MHz GBW and a 62-degree
phase margin at the unity gain frequency, and about a 70-degree margin at
the frequency of closed loop gain in the target feedback configuration. The DC
gain simulated with the nominal transistor parameters is 62 dB and the settling
time to 10-bit accuracy is 4.5 ns.

2.5 Experimental Results 

The circuit was fabricated with a 0.5-µm double-poly triple-metal CMOS
process. A photograph of the prototype chip, containing two S/H circuits and 
a programmable divider, is shown in Figure 12.5. 

Figure 12.5. A photograph of the double-sampling S/H prototype.

The circuit is characterized with the beat frequency test with several
sub-sampling ratios. A spectrum of a 73.3-MHz, 1.75-Vpp signal, which is sampled
at 220 MS/s, is shown in Figure 12.6. The SFDR is limited by the third harmonic,
which in this case is at the 65.6-dBc level.

In addition to the signal and its harmonics, there is an extra spurious frequency 
at the 25-kHz offset from the signal peak. It is not generated by the S/H 
circuit under test. The same spur (and also a number of its multiples) is seen 
in the spectrum of the signal generator, which is used as the clock source. 
Thus, it probably originates from the leakage of the PLL reference in the signal 
generator. 

The SFDR is measured as a function of the signal amplitude at 130 and 
220-MS/s sampling rates. The results are shown in Figures 12.7 and 12.8 
respectively. There are two curves in both the figures, a solid curve for a 200- 
kHz input signal and a dashed curve for an input signal which is within a small 
frequency offset of one third of the clock frequency. The results show that the 
circuit operates well at both clock rates. As expected, the SFDR decreases as 
the signal amplitude is increased. At 220 MS/s 10-bit resolution is achieved 
with a 1.8-Vpp signal from DC up to one third of the clock frequency. 

Limitations in the measurement equipment prevented the testing of the circuit
at clock rates higher than 220 MS/s. Neither was it possible to measure the
circuit with an fs/2 input signal at 220 MS/s because of the lack of a proper
filter to remove the harmonics from the test signal. However, at 130 MS/s, the
SFDR was even slightly better with the fs/2 than with the fs/3 input signal,
which is probably due to the fact that with the fs/2 input frequency the voltage
on the sampling capacitors is almost unchanged between the samples.

Figure 12.6. Measured spectrum of a 73.3-MHz, 1.75-Vpp signal sampled at 220 MS/s. The
aliased signal is seen at the 33.3-kHz frequency and its harmonics at multiples of that frequency.
The spur at a 25-kHz offset from the signal peak does not originate in the S/H circuit.

Figure 12.7. SFDR as a function of the signal amplitude at a 130-MS/s sampling rate.

Figure 12.8. SFDR as a function of signal amplitude at a 220-MS/s sampling rate.

An interesting study is the effect of the switch control voltage on the SFDR. 
The measurement results obtained with a 73.3-MHz signal at a 220-MS/s sam- 
pling rate are shown in Figure 12.9. The SFDR dependence on the control 
voltage is almost linear from 3.2 V to 4.4 V. Increasing the voltage above 4.5 V 
does not give any improvement, which indicates that the distortion from the 
other sources starts to dominate at that level. The 4.5-V control voltage results 
in a 4-V maximum switch transistor gate-drain voltage, which is a couple of 
hundred millivolts larger than the maximum long-term reliable value. 

The measurements with odd sub-sampling ratios revealed that there is a
spurious tone at fs/2 - f. It most probably originates from the timing skew
between the parallel circuits. In the worst case, when the input signal frequency
is within a small offset of the Nyquist frequency, the level of the spur is -61 dBc,
which corresponds to a 2.6-ps timing skew.
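
The relation between the spur level and the timing skew can be sketched with the small-skew approximation for a two-channel interleaved sampler, in which the image at fs/2 - f is about π · f · Δt relative to the signal. That this is the expression behind the quoted figure is an assumption, although it reproduces both the 2.6-ps value above and the simulated 53-dBc image for a 10-ps skew reported for the second prototype. The function names are hypothetical.

```python
import math


def skew_image_dbc(f_in, skew):
    """Image level (dBc) at fs/2 - f_in for a small timing skew [s] between
    the two interleaved sampling paths (small-angle approximation)."""
    return 20.0 * math.log10(math.pi * f_in * skew)


def skew_from_image(f_in, image_dbc):
    """Invert the approximation: timing skew [s] for a measured image level."""
    return 10.0 ** (image_dbc / 20.0) / (math.pi * f_in)


# Measured case: -61 dBc with the input close to Nyquist (110 MHz at 220 MS/s).
print(f"skew  = {skew_from_image(110e6, -61.0) * 1e12:.2f} ps")

# Simulated case: 10-ps skew with the input at one third of 220 MHz.
print(f"image = {skew_image_dbc(220e6 / 3.0, 10e-12):.1f} dBc")
```

Because the image grows in proportion to the input frequency, the worst case occurs near the Nyquist frequency, as stated above.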

The power consumption of the circuit without the clock generator was
measured as 25 mW at a 220-MS/s sampling rate with an fs/3, 1.75-Vpp input
signal. The performance of the circuit is summarized in Table 12.1.

Figure 12.9. SFDR of a 73.3-MHz signal sampled at a 220-MS/s rate as a function of the switch
control voltage.



Table 12.1. Measured performance of the double-sampling S/H.

Sampling rate              220 MS/s
SFDR                       65 dBc
Differential input swing   1.8 Vpp
Supply voltage             3.0 V
Power consumption          25 mW
Active area                0.06 mm²
Technology                 0.5-µm CMOS



3. Timing Skew-Insensitive Double-Sampling S/H 

In order to avoid the timing skew problem, another version of the S/H circuit,
employing the skew-insensitive sampling proposed at the end of Chapter 9, was
designed and tested. This prototype has been reported in [9].



3.1 Architecture 

The architecture of the circuit is a fully differential version of the proposed
timing skew-insensitive circuit. For the sake of convenience it is shown again
in Figure 12.10. The building blocks (opamp, switches, CMFB, etc.), except the
clock generator, are from the earlier prototype.

Figure 12.10. Timing skew-insensitive double-sampling S/H circuit.

3.2 Clock Generator 

The new clock generator is shown in Figure 12.11. The circuit generating
the non-overlapping signals is basically the same as used in the first S/H circuit. 
The short pulses for the common sampling switch are constructed with a circuit 
consisting of an inverter, a delay element, and a NAND gate. The D-flipflop 
generates the complementary half-speed clock signals. 

To reduce the jitter in the sampling clock (φs), the buffer chain can be made
shorter by connecting the clock input of the D-flipflop directly to the incoming 
clock. 

3.3 Simulations 

The effect of timing skew on the first and second S/H circuits is compared by 
adding an intentional 10-ps timing skew in the clock signals. As expected, this 
has no effect on the skew-insensitive circuit. The FFT spectra for both circuits 
calculated from transient simulations are shown in Figure 12.12. The clock 
frequency in the simulations is 220 MS/s and the signal frequency one third of 
that. An error image with a 53-dBc magnitude is seen at the 38-MHz frequency 
in the output of the first circuit. This is exactly as predicted by the theory. There 
is no sign of this image in the spectrum of the skew-insensitive circuit. Except
for the image, the two spectra are almost identical, which indicates that the new
switching scheme does not degrade the other properties of the circuit.

Figure 12.11. Clock generator for the skew-insensitive S/H circuit.

Figure 12.12. The simulated effect of a 10-ps timing skew on the spectrum of the first (upper
plot) and the second (lower plot) prototype circuits. (Horizontal axis: frequency in Hz, ×10^7.)

3.4 Experimental Results 

The test chip was fabricated with the same 0.5-µm CMOS process as the
first chip. A photograph of the chip is shown in Figure 12.13. It turned out that
an unfortunate mistake was made in the beat frequency test setup design; the
same programmable divider that was used with the first prototype was applied
without any modifications. The new clock generator, however, now included
a frequency division by two and, as a result, the second S/H circuit on the
chip could be clocked only with the rates fs/4, fs/6, fs/8, etc. The odd
sub-sampling ratios would have been needed to investigate the spectrum image
resulting from the timing skew. This is not possible with the even ratios, since
then the image aliases on top of the fundamental signal.

Figure 12.13. Photograph of the skew-insensitive S/H prototype.

Although the elimination of timing skew could not be verified with this
prototype, the circuit’s performance was measured with the available sub-sampling
ratios. The results show that the performance is almost identical to the first
prototype. This indicates that the timing skew-insensitive switching does not
degrade other circuit characteristics. A spectrum where a 1.8-Vpp, fs/4 signal is
sampled at 220 MS/s is shown in Figure 12.14. Again, the spurs at the 25-kHz
offset from the fundamental are due to a poor quality clock source.

4. 10-Bit, 200-MS/s Parallel Pipeline ADC 

Figure 12.14. Measured spectrum of the timing skew-insensitive S/H circuit. The spectrum is
measured at a 220-MS/s sampling rate with a 1.8-Vpp input signal having a frequency which
is a small offset from fs/4. The spurs close to the carrier are part of the spectrum of the used
clock source.

The most promising topology for high-resolution, high-speed CMOS ADCs
is the pipeline architecture. Parallel pipeline ADCs with several time-interleaved
component ADCs have been introduced to attain very high sampling rates with
acceptable power consumption [91, 90]. A resolution of ten bits and conversion
rates up to 100 MS/s have been reported [96]. Their power dissipation,
however, has risen very high, especially with high sampling rates. By employing
double-sampling and parallelism with time-interleaved pipeline ADCs, a very
competitive power and area consumption can be obtained [95].

The goal set for the prototype presented in this section was to demonstrate
10-bit resolution at a 200-MS/s sampling rate using a 0.5-µm CMOS technology.
The design has been published in [6] and [5].

As discussed in Chapter 4, in a single-channel pipelined ADC the one ef- 
fective bit per stage architecture gives a maximum conversion rate and min- 
imum power consumption when capacitor scaling is not used. It is obvious 
that increasing the number of parallel channels raises the conversion rate and 
lowers the slew rate and bandwidth requirements of the amplifier in a switched- 
capacitor (SC) gain stage. The current consumption of an operational amplifier 
is a nonlinear function of the bandwidth, which suggests that for a given technol- 
ogy there exists an optimum degree of parallelism with respect to the sampling 
rate and power dissipation. For the target 200 MS/s sampling rate the optimum 
number of channels was found to be four [5]. 




4.1 ADC Architecture 

The well-known problems in time-interleaved ADCs arise from mismatch 
between the parallel channels. These errors are offset, gain mismatch, and skew 
in the clock signals. Offset is seen as tones at multiples of fs/M, where M is the
number of parallel channels and fs the sampling rate. Both gain mismatch and 
timing skew generate spectral images of the signal around the same frequencies. 
It is not possible to achieve a 10-bit resolution with the 200 MS/s sampling rate 
without performing special actions to eliminate these errors. 
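
The locations of these mismatch products are easy to tabulate. The sketch below lists, for an M-channel time-interleaved ADC, the offset tones at multiples of fs/M and the gain and skew images around them, all folded into the first Nyquist zone. The function names are hypothetical, and the 30-MHz input frequency in the example is an assumed value chosen only for illustration.

```python
from fractions import Fraction


def fold(f, fs):
    """Fold a frequency into the first Nyquist zone 0 .. fs/2."""
    f = Fraction(f) % Fraction(fs)
    return min(f, Fraction(fs) - f)


def interleaving_spurs(f_in, fs, channels):
    """Return (offset tones, signal images) of an M-channel interleaved ADC."""
    centers = [Fraction(k * fs, channels) for k in range(1, channels)]
    offset_tones = sorted({fold(c, fs) for c in centers})
    images = {fold(c + sign * f_in, fs) for c in centers for sign in (1, -1)}
    images.discard(fold(f_in, fs))   # the signal itself is not an image
    return offset_tones, sorted(images)


# Four channels at 200 MS/s with an assumed 30-MHz input signal.
tones, images = interleaving_spurs(30_000_000, 200_000_000, 4)
print("offset tones [MHz]:", [float(t) / 1e6 for t in tones])
print("images [MHz]:      ", [float(i) / 1e6 for i in images])
```

For the four-channel case the offset tones fall at fs/4 and fs/2, while the images move with the input frequency, which is why they cannot be removed by filtering.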

The most straightforward way to avoid timing skew, also used in this design, 
is to employ a front-end sample-and-hold (S/H) circuit. Digital calibration is 
chosen for the purpose of eliminating the offset, which mainly originates from 
the offset voltages of the operational amplifiers. The gain error, arising predom- 
inantly from capacitor mismatch, is left uncalibrated since it can be adequately 
suppressed by a careful layout design for the 10-bit accuracy requirement. 

A block diagram of the 10-bit 200 MS/s pipeline ADC is shown in Fig- 
ure 12.15 and its clock signals in Figure 12.16. The core of the converter 
consists of two parallel double-sampling pipeline ADCs. The differential ana- 
log input is time-interleaved to the four component ADCs in the order indicated 
in Figure 12.16 by a double-sampling S/H circuit. The digital outputs of the 
stages of the parallel ADCs are corrected and multiplexed to two 100-MHz 
time-interleaved outputs, which are offset compensated. 

4.2 Front-End S/H Circuit 

The front-end S/H circuit is based on the skew-insensitive S/H prototype 
described in the previous section. The only difference is in the switches; to 
improve the linearity and the reliability of the circuit, a separate high supply 
voltage for the switches is not used any more. Instead, the input switches, 
which suffer the most from the insufficient gate overdrive, are realized using 
bootstrapped switches, shown in Figure 12.17 [134]. The same circuit is also 
used as the input switch in the first pipeline stages, which have to track the input 
signal in a quarter of the clock period. 

Since the feedback factor in the flip-around S/H architecture is close to one, 
the S/H circuit achieves twice the speed of a pipeline stage using the 1.5-bit 
architecture, where the feedback factor is ideally 0.5. Thus, the S/H circuit 
can drive two parallel ADC channels without becoming a bottleneck for the 
conversion rate. Utilizing double-sampling in both circuit blocks allows the 
number of channels to be increased to four. 

4.3 Component ADCs 

The four pipeline component ADCs employ the 1.5 bit/stage topology with 
RSD error correction. The eight pipeline stages are followed by a two-bit flash 
ADC, as indicated in Figure 12.15. The property of the successive pipeline 
stages working in opposite clock phases is exploited by sharing the operational 
amplifiers between two parallel component ADCs. 

Figure 12.15. Block diagram of the 4-channel parallel pipeline ADC. 

Figure 12.16. Clock signals for the parallel pipeline ADC. 

In the double-sampling MDAC, shown in Figure 12.18, the capacitor arrays, 
operating in a 180° phase shift, have their own three-level sub-ADCs, which 
minimizes the delay of the switch control signals but doubles the number of 
comparators needed. However, the dynamic comparators used in the sub-ADC 
have a very small power dissipation and area. In the MDAC the switches whose 
timing is critical and the switches connected to the input of the amplifier are 
nMOS switches. In all other switches a more linear response in the signal 
voltage range and minimization of clock feedthrough errors are achieved by 
utilizing CMOS switches. In the first pipeline stage the input switches tracking 
the output of the S/H circuit are realized with a bootstrapped MOS switch. 

Figure 12.17. Bootstrapped switch. 

Figure 12.18. Two MDACs share a common opamp. 

Figure 12.19. Folded cascode OTA with high-swing regulated cascode devices. 

Figure 12.19 shows the opamp, which uses the folded cascode architecture 
with regulated cascode devices. Regulation amplifiers based on a common gate 
input structure make possible the biasing of the cascode nodes at one Vdsat, 
thus providing high output signal swing. 

Measurements of the first version of the prototype revealed that many of the 
comparators had offsets exceeding the correction range of the RSD logic, which 
in this 1.5 bit/stage architecture is as high as ±200 mV. This indicated that there 
was something wrong with the dynamic comparator circuit, which is based on 
input devices biased in the triode region [71]. Simulations showed that the 
structure is rather sensitive to mismatch in the devices forming the regenerative 
latch. Furthermore, the sensitivity is strongly dependent on the signal common 
mode voltage. 
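
The role of the RSD correction range can be illustrated with a behavioral model. The sketch below is an assumption-laden idealization, not the circuit of the prototype: eight 1.5-bit stages with nominal thresholds at ±Vref/4, an ideal two-bit flash, ideal residue amplifiers, and a normalized Vref of 1. If Vref were 0.8 V, as the 1.6-Vpp differential input swing might suggest, Vref/4 would correspond to the ±200 mV quoted above. The function name and the random offsets are hypothetical.

```python
import random


def pipeline_1p5bit(v, offsets, vref=1.0):
    """Convert v (|v| < vref) with 1.5-bit stages and a final 2-bit flash.

    offsets[i] = (lo, hi) are the errors added to the nominal comparator
    thresholds -vref/4 and +vref/4 of stage i.
    """
    n = len(offsets)
    code = 1 << (n + 1)                  # mid-scale of the (n+2)-bit word
    r = v
    for i, (lo, hi) in enumerate(offsets):
        d = int(r > vref / 4 + hi) - int(r < -vref / 4 + lo)  # -1, 0, +1
        r = 2 * r - d * vref             # residue passed to the next stage
        code += d * (1 << (n - i))       # RSD correction: overlapped add
    flash = sum(r > t for t in (-vref / 2, 0.0, vref / 2))    # 0 .. 3
    return code + flash - 2


random.seed(1)  # arbitrary seed, used only for repeatability
offsets = [(random.uniform(-0.24, 0.24), random.uniform(-0.24, 0.24))
           for _ in range(8)]
inputs = [(k + 0.5) / 512 - 1 for k in range(1024)]  # centre of each code
wrong = sum(pipeline_1p5bit(v, offsets) != k for k, v in enumerate(inputs))
print("wrong codes with offsets inside the correction range:", wrong)
```

As long as every comparator offset stays inside ±Vref/4 the residues remain within the input range of the following stage and the output code is unaffected; a larger offset lets the residue grow beyond the range, which is the failure mode observed in the first prototype.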




Figure 12.20. Differential pair dynamic comparator. 




Figure 12.21. Simplified schematic of the reference voltage driver. 

To get rid of the errors caused by the offsets a new comparator, shown in Fig- 
ure 12.20, was designed and used in the second version of the prototype. It uses 
two differential pairs with pulsed current sources to form currents proportional 
to the difference between the differential reference voltage and differential sig- 
nal voltage. The currents are summed and fed into a latch. The circuit is not 
sensitive to common mode voltages, even when they are not the same in the sig- 
nal and in the reference, and the offset is primarily determined by the mismatch 
of the input devices. This is verified by measurements of separately processed 
test structures [225]. 

4.4 Reference Voltage Driver 

The large number of pipeline stages using common reference voltages 
increases the capacitive load in the reference nodes to several picofarads. To 
guarantee that the reference does not limit the settling speed, its output 
impedance has to be on the order of a few tens of ohms. This means that a 
resistor string implementation of the reference would have a very large quiescent 
current or, alternatively, a large external capacitor would have to be used. More 
reasonable power consumption without an external capacitor is obtained with the 
circuit proposed in Figure 12.21. There, the low impedance voltage outputs 
are provided by class AB buffers constructed of complementary transistors. 
The buffers for the positive and the negative reference are cascaded in order to 
minimize steady state current consumption. 

4.5 Digital Offset Calibration 

The main source of offset is the input offset voltage of the operational am- 
plifiers utilized in the pipeline stages. Because of the double-sampling there 
is no idle time which could be used to auto-zero the amplifier offset and thus 
the problem has to be handled in some other way. The methods that can be 
used to suppress the offset in parallel ADCs include digital [91] and analog 
[97] calibration and digital post filtering in the case of a two-channel ADC [93]. 
In analog calibration the offset is measured from the digital output and, using 
a D/A converter, a canceling signal is injected into the channel input. Digi- 
tal calibration does the canceling in the digital domain, simply by subtracting 
the measured offset from each sample. The advantage of the analog method 
is that it does not reduce the signal range. However, the robustness achieved 
with digital calibration usually makes this method preferable, despite the small 
reduction in the signal range. 

In this design the digital offset calibration is applied in the multiplexed half- 
rate output of the double-sampled channel pair. It is assumed that there is no 
significant offset, and thus no need for calibration, between the signal paths 
inside the double-sampled pipeline. This is due to the fact that the opamps, 
which are the main source of the offset, are the same for both the signal channels. 
During the offset measurement the normal operation of the converter has to be 
suspended. The calibration can, however, usually be performed at the power-up 
or during some idle periods. 

The calibration circuit consists of an adder, a register for storing the measured 
offset, and a state machine that provides the control signals for the logic and 
the ADC. During the calibration the ADC input is shorted to ground and the 
offset is obtained by averaging the output signal over 16 clock cycles. The 
averaging is realized simply by adding together 16 consecutive output codes 
and performing a bit shift of four for the result. The calibration is activated 
with an external one-bit control signal. 
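
The averaging and subtraction described above amount to very little arithmetic, as the following sketch shows. The function names, the mid-scale reference code of 512, and the clipping of the corrected code are assumptions made for illustration; the source only specifies the 16-sample sum, the four-bit shift, and the subtraction.

```python
def measure_offset(codes, mid_scale=512):
    """Average 16 output codes taken with the ADC input shorted.

    The sum is divided by 16 with a 4-bit right shift, as an adder and a
    shifter would do in hardware. The result is the offset relative to
    the assumed ideal mid-scale code.
    """
    assert len(codes) == 16
    return (sum(codes) >> 4) - mid_scale


def apply_calibration(code, offset, n_bits=10):
    """Subtract the stored offset and clip to the valid code range."""
    return max(0, min((1 << n_bits) - 1, code - offset))


# Hypothetical calibration run: the shorted-input codes hover around 517.
offset = measure_offset([517, 518, 517, 516] * 4)
print("measured offset:", offset)
print("corrected code :", apply_calibration(600, offset))
```

The clipping step makes the cost of the digital method visible: codes within the measured offset of either end of the range can no longer be reached, which is the small reduction in signal range mentioned above.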

The most significant ADC output bits have a strong correlation to the analog 
input signal. This is utilized to investigate the signal feedthrough from the 
output to the input by adding the possibility of scrambling the outgoing digital 
words with a pseudo-random bit-stream [226]. The scrambling is realized by 
putting XOR gates before each output buffer and applying the random bit to 
their other input. For unscrambling, the random bits are taken out through an 
extra package pin. A simplified block diagram of the calibration and scrambling 
circuit is depicted in Figure 12.22. 

Figure 12.22. Offset calibration and scrambling of the output. 

Figure 12.23. Parallel pipeline ADC chip micrograph. 

Figure 12.24. Measured DNL and INL. 
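
The output scrambling can be modeled in a few lines. In this sketch, which assumes a 10-bit word and uses Python's random module merely as a stand-in for the on-chip pseudo-random source, every output bit is XORed with the same random bit, and the receiver repeats the operation with the bit read from the extra pin.

```python
import random


def scramble(word, rnd_bit, n_bits=10):
    """XOR every bit of the output word with the same pseudo-random bit."""
    mask = (1 << n_bits) - 1 if rnd_bit else 0
    return word ^ mask


prbs = random.Random(0)          # hypothetical stand-in for the PRBS source
words = [512, 513, 700, 3]
bits = [prbs.getrandbits(1) for _ in words]
sent = [scramble(w, b) for w, b in zip(words, bits)]
recovered = [scramble(s, b) for s, b in zip(sent, bits)]
print(recovered == words)
```

Because XOR is its own inverse, the same function serves for scrambling and unscrambling, while the transmitted words are decorrelated from the analog input.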



4.6 Experimental Results 

The prototype circuit is fabricated using a 0.5-µm triple-metal double-poly 
CMOS process. The total area of the chip is 7.4 mm² and its die photograph 
is shown in Figure 12.23. The circuit is measured with a 3.0-V supply with a 
differential input swing of 1.6 Vpp. 

The static linearity curves obtained with the code density test are presented 
in Figure 12.24, which shows the DNL as being within ±0.8 LSB and the INL 
within ±0.9 LSB. A spectrum obtained in a beat frequency test with a 199.975-MHz 
input signal at a 200-MHz clock rate is shown in Figure 12.25. In Figure 12.26 a 71.3-MHz 
full-scale signal is sampled at the full clock rate, resulting in a spurious-free 
dynamic range (SFDR) of 55 dB. From the spectrum it can be seen that the 
offset tone at 50 MHz is more than 56 dB below the signal level and the 
gain mismatch tones around fs/4 and fs/2 remain below the noise level. The 
mismatch tone in the vicinity of half the sampling frequency usually rises to 
limit the SFDR at high signal frequencies. The total harmonic distortion (THD) 
as a function of signal frequency is plotted in Figure 12.27. THD starts from 
55 dB, becoming about 46 dB around the Nyquist frequency, then rising again 
to the 55-dB level near the sampling frequency, implying that the performance 
of the ADC is limited by the pipeline component ADCs rather than by the S/H 
circuit. 

Figure 12.25. Spectrum obtained with a 200-MHz beat frequency test. 

Figure 12.26. Measured spectrum where a 71.3-MHz signal is sampled at a 200-MHz clock rate. 

Figure 12.27. THD as a function of signal frequency. 
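
Where the tones of these measurements appear in the output spectrum follows from simple aliasing arithmetic, sketched below. The helper name is hypothetical and the 199.975-MHz input for the beat frequency test is taken from the text above.

```python
def alias(f, fs):
    """Frequency at which a tone at f appears after sampling at fs."""
    f = f % fs
    return min(f, fs - f)


FS = 200.0                                       # sampling rate in MHz
beat_khz = round(alias(199.975, FS) * 1e3, 3)    # beat frequency test
h2 = round(alias(2 * 71.3, FS), 3)               # 2nd harmonic of 71.3 MHz
h3 = round(alias(3 * 71.3, FS), 3)               # 3rd harmonic of 71.3 MHz
print("beat tone (kHz):", beat_khz)
print("aliased harmonics (MHz):", h2, h3)
```

An input just below the sampling rate is thus observed as a slow beat tone, which exercises the full input bandwidth of the front-end while the converter output changes only slowly.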



Table 12.2. Summarized performance of the ADC. 

Technology                                   0.5-µm CMOS 
Resolution                                   10 bits 
Sample Rate                                  200 MS/s 
Supply Voltage                               3.0 V 
Area                                         7.4 mm² 
DNL                                          ±0.8 LSB 
INL                                          ±0.9 LSB 
THD                                          46 dB 
Power Dissipation of Core ADC                280 mW 
Power Dissipation including Output Buffers   405 mW 



The mismatch tone probably results from improperly-operating common 
mode feedback, which creates asymmetry between the two double-sampled 
pipeline ADCs. This also limits the SNDR at high signal frequencies to 43 dB. 
However, excluding the mismatch tone, the spectral performance indicates that 
the ADC can sample narrow IF bands around 200 MHz with a THD of 55 dB. 
The measured power consumption without and with the digital output drivers is 
280 mW and 405 mW at a 3.0-V supply, respectively. Table 12.2 summarizes 
the overall performance. 

5. 13-Bit Self-Calibrated IF-Sampling Pipelined ADC 

The increasing bit rates in both wired and wireless telecommunication sys- 
tems are made possible by utilizing wider signal bandwidths. Simultaneously, 
there is a desire to realize an increasing portion of the receiver functions in the 
digital domain. These trends lead to more and more demanding specifications 
for A/D converters. Typically, a resolution of 12 to 15 bits is needed at 
input signal frequencies of tens of megahertz and with a sampling rate of 
50 MHz or higher. An important application area is 3rd generation cellular base 
stations, where the current trend is to move the analog-digital boundary to the 
intermediate frequency (IF), which makes the design of the A/D front-end even 
more challenging. 

The requirement to sample an IF signal sets stringent specifications for the 
analog front-end of the ADC. Usually it is realized with a sample-and-hold 
(S/H) circuit which has to be able to track the high frequency input signal. One 
of the major challenges in implementing the S/H is the high frequency linearity 
of the sampling circuit, which is mainly determined by the properties of the 
sampling switch. In IF sampling the signal down conversion is performed by 
the sampling operation and thus the jitter of the LO signal, which is used as the 
sampling clock, must be very low. 

Typically, the matching properties of the circuit elements set the maximum 
attainable ADC resolution (with a reasonable yield) somewhere between 10 and 
12 bits. To achieve a higher resolution with a Nyquist rate ADC, some kind 
of calibration or trimming has to be applied. Recently digital self-calibration 
techniques, which are made feasible by the possibility of including more and 
more digital circuitry in the ADC, have gained in popularity. 

Originally published in [I], the prototype presented in this section is a 13-bit 
pipeline ADC incorporating a digital self-calibration algorithm. The ADC has 
a front-end S/H circuit designed to sample signals from a 200-MHz IF. In order 
to cope with thermal noise, the signal range is set as high as 3.8 V differential. 
Because the circuit operates on a 2.9-V supply, the large signal range has a 
major impact on the circuit structures used in the opamps, comparators, and 
switches. 

5.1 Architecture 

The block diagram of the ADC is depicted in Figure 12.28. It consists 
of an IF-sampling front-end sample-and-hold circuit, a self-calibrated 13+2-bit 
pipeline ADC, a delay locked loop (DLL) for synchronizing the sampling of the 
first pipeline stages and the S/H output, and an on-chip calibration state machine 
for controlling the S/H circuit and the first four stages of the pipeline during 
the calibration cycle. The two extra bits are used internally for the calibration. 
Redundant signed digit (RSD) coding is exploited to relax the comparator offset 
specifications. Calculation of the calibration coefficients and calibration coding 
is realized with an external FPGA circuit. 

Figure 12.28. Block diagram of the self-calibrated IF-sampling ADC. 

5.2 Front-end S/H Circuit 

The performance of the ADC at high signal frequencies is predominantly set 
by the front-end S/H circuit. Since it is in front of the signal chain, its thermal 
noise and distortion are not attenuated by any preceding gain stages and thus it 
has to fulfill the full resolution requirement. 




The S/H circuit, shown in Figure 12.29, is an SC amplifier with a 
programmable gain of 1 or 2. In the unity gain mode the circuit acts as a flip-around 
S/H circuit, where the input voltage is sampled into the capacitors during the 
sampling phase (φ, φ1 = 1) and in the hold phase (φ, φ1 = 0; φ2 = 1) the 
capacitors are connected to a feedback loop around the opamp. In the gain-of-two 
mode the sampling phase is unchanged, but in the hold phase one of the 
capacitors is connected to the opamp output and the other to the signal ground 
(φ, φ1, φ2 = 0). Now a charge transfer occurs from the grounded capacitor to 
the feedback capacitor and, as a result, the sampled voltage is amplified by the 
ratio of the total capacitance to the feedback capacitance. The circuit utilizes 
the bottom-plate sampling technique, where the sampling switch (controlled 
with φ) is opened slightly before the input switches (controlled with φ1) so as 
to avoid signal-dependent charge injection from the input switches. 

The two modes set different requirements for the opamp; the unity gain mode 
calls for unity gain stability, while the gain-of-two mode doubles the bandwidth 
requirement as a result of a smaller feedback factor. Thus, since the opamp has 
to fulfill both of the requirements, its specifications are more stringent than in 
either mode alone. The input-referred thermal noise in the sampling phase is 
the same in both modes, but the peak signal-to-noise ratio is 6 dB lower in the 
gain-of-two mode as a result of the smaller permissible signal amplitude. The 
total sampling capacitance (2C) is 10 pF, which leaves enough margin to the 
target resolution for the noise contribution of the opamp and the ADC. 
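
The noise margin mentioned above can be estimated with a rough kT/C calculation. The sketch below rests on several assumptions that are not stated in the text: a temperature of 300 K, a fully differential circuit in which each half samples onto the full 10 pF (so that the differential noise power is 2kT/C), a full-scale sine equal to the 3.8-V range interpreted as peak-to-peak differential, and no noise other than that sampled in the tracking phase.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def ktc_snr_db(c_sample, v_diff_pp, temp=300.0):
    """Peak SNR limited by sampled kT/C noise only (assumptions above)."""
    noise_power = 2 * K_BOLTZMANN * temp / c_sample   # V^2, differential
    signal_power = (v_diff_pp / 2) ** 2 / 2           # full-scale sine
    return 10 * math.log10(signal_power / noise_power)


snr_unity = ktc_snr_db(10e-12, 3.8)       # unity gain mode
snr_gain2 = ktc_snr_db(10e-12, 3.8 / 2)   # gain of two: half the input swing
snr_13bit = 6.02 * 13 + 1.76              # ideal 13-bit quantization SNR
print(f"kT/C-limited SNR, unity gain : {snr_unity:.1f} dB")
print(f"kT/C-limited SNR, gain of two: {snr_gain2:.1f} dB")
print(f"ideal 13-bit quantization SNR: {snr_13bit:.1f} dB")
```

Under these assumptions the sampled noise alone lies more than 10 dB below the 13-bit quantization noise, and halving the permissible input amplitude costs the 6 dB quoted for the gain-of-two mode.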

5.3 Opamp 

From the opamp, high gain, wide bandwidth, and low noise are required 
simultaneously. Furthermore, in order to obtain a high signal-to-noise ratio with 
low supply voltage, it is important to maximize the opamp output voltage swing, 
which makes the utilization of a rail-to-rail output stage almost a necessity. The 
need for a high DC gain excludes the possibility of resorting to the traditional 
Miller topology and thus the options are a three-stage architecture or a two- 
stage opamp with a high-gain first stage. The latter is chosen because of easier 
compensation and potentially higher speed. 

The opamp architecture, shown in Figure 12.30 and already discussed in 
Chapter 7, is based on a telescopic input stage and a rail-to-rail output stage. 

5.4 Switches 

The input switches of the S/H circuit are realized with the double-side boot- 
strapped circuit introduced in Chapter 6. Thanks to the triple-well process, the 
switch transistor bulk can be made to track the signal during the tracking phase. 
This yields a remarkable improvement in linearity. Hold-mode feedthrough via 
the switches is suppressed by the canceling technique proposed in Chapter 6. 

Figure 12.30. High-speed BiCMOS opamp. 



The sampling switches are also realized with bootstrapped circuits, which 
helps to minimize the size of the switch transistor as well as to reduce the 
signal-dependent charge injection due to the small voltage swing over the switch 
on-resistance. 

The common mode settling of the S/H circuit (or a pipeline stage) need not be 
as fast as the differential settling, since the error caused by incomplete settling 
is attenuated by the common mode rejection ratio. However, if the common 
mode voltage is still changing at the end of the hold phase, signal-dependent 
on-resistance of the feedback switches may result in a differential error voltage. 
Thus, bootstrapping is also employed in the feedback switches, as well as the 
input switches and the feedback switches of the two first pipeline stages. 

5.5 Clock Buffer and Clock Generator 

In IF-sampling the signal-to-noise ratio easily becomes limited by jitter, 
because the SNR degradation resulting from jitter is proportional to the rate of 
signal change. For example, a 75-dB SNR at 200 MHz requires a jitter smaller 
than 141 fs. 
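
The jitter figure follows from the standard relation for a full-scale sinusoid, SNR = -20 log10(2π f σ), where σ is the rms jitter. A minimal check, with a hypothetical function name:

```python
import math


def max_jitter_rms(snr_db, f_in):
    """Largest rms jitter (s) allowing snr_db for a sine at f_in (Hz)."""
    return 10 ** (-snr_db / 20) / (2 * math.pi * f_in)


sigma = max_jitter_rms(75.0, 200e6)
print(f"allowed rms jitter: {sigma * 1e15:.1f} fs")
```

Because the allowed jitter is inversely proportional to the input frequency, the same SNR at a 20-MHz input would tolerate ten times more jitter.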

Even if the external clock source were ideal (jitter-free), on-chip clock 
buffering can easily add more jitter than allowed. As is known from ring oscillators, 
the jitter is proportional to the total delay without depending on the number of 
delay elements used. Thus, in this design the target is to minimize the delay 
from the clock pin to the sampling switch by making the buffer chain as short 
and fast as possible. 

Figure 12.31. Clock buffer. 

The buffer utilized is shown in Figure 12.31. It has a differential input and 
internal common mode voltage biasing, which requires the clock signal to be 
externally DC decoupled. The buffer consists of two stages, a differential pair 
first stage and an inverter second stage. To avoid any additional delay the 
sampling clock (φ) is not generated with a non-overlapping clock generator, 
but taken directly from the buffer output. The signals needed to control the 
other switches and the opamp are made with a clock generator, though. 

The first stage of the ADC samples the S/H circuit output with a signal whose 
edge falls before the sampling clock of the S/H circuit rises. Since the sampling 
clock cannot be delayed, the clock edge needed by the ADC is generated with 
a DLL which is locked to the sampling clock. 

5.6 DLL 

The architecture of the DLL is shown in Figure 12.32. The variable delay is 
adjusted to be a half of the clock cycle. As a result, the taps of the delay line 
provide evenly-spaced clock edges between the falling and the rising edge of 
the incoming clock signal. 

The circuit consists of two voltage-controlled 16-element delay lines, the first 
of which belongs to the primary loop, which is the only active loop in the locked 
state. The second delay line is used to assure locking to the correct phase. It 
is fed with pulses instead of a continuous clock waveform, which allows the 
detection of the conditions where the primary loop is outside its phase capture 
range and tries to lock onto a wrong clock phase. The coarse control logic 
generates up2 and down2 pulses, which force the primary loop into the correct 
range. Once there, the coarse control becomes inactive. 
The differential delay element is based on a current-starved two-inverter 
cell, which is controlled with two voltages. The loop filter is built around a 
differential gm-cell. 
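
The spacing of the delay line taps in the locked state is easy to tabulate. The sketch below assumes that the 16 elements have equal delays and uses a 20-ns clock period (50 MHz) purely as an illustrative value; the function name is hypothetical.

```python
def dll_tap_delays(clock_period, n_elements=16):
    """Delay at each tap when the whole line spans half a clock period."""
    step = (clock_period / 2) / n_elements
    return [step * (i + 1) for i in range(n_elements)]


taps_ns = dll_tap_delays(20.0)    # clock period in ns (assumed value)
print("tap spacing (ns):", taps_ns[0])
print("last tap (ns)   :", taps_ns[-1])
```

The last tap coincides with the opposite clock edge, so any intermediate tap provides an edge at a known fraction of the half period regardless of the absolute element delay.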

5.7 Self-Calibrated Pipeline A/D Converter 

Self-calibration is applied to the first two 2.5-bit stages, while the 9+2-bit 
back-end pipeline ADC employs 1.5-bit stages. A small state machine, which 
generates the MDAC and S/H control signals during the calibration, and the 
RSD correction adders are also implemented on-chip. The 15 uncalibrated 
output bits, along with the raw outputs of the calibrated stages and three control 
signals, are fed out of the chip to the external calibration logic and coding 
circuitry. The ten reference voltages needed in the coarse A/D conversions of 
the stages are generated with an on-chip resistor string, while the references for 
the capacitive MDACs are from off-chip voltage sources. 



Partitioning of the resolution is chosen on the basis of the fact that a high- 
resolution stage in front of the pipeline ADC provides linearity improvement 
as well as power savings in the subsequent stages (see Chapter 4). However, 
each additional bit in the first stage halves the opamp’s feedback factor and 
doubles the number of comparators and their accuracy requirement. The same 
benefits are achieved by having two medium-resolution front-end stages, the 
amplifier and comparator specifications still being reasonable. The back-end 
pipeline should have the minimum stage resolution, so as to minimize power 
consumption. On the basis of simulations on a behavioral pipeline ADC model, 
the target 13-bit linearity can be achieved with two calibrated 2.5-bit stages in 
front of a 1.5-bit/stage back-end. 
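
The trade-off behind this partitioning can be made concrete with the ideal properties of an RSD stage. The sketch below assumes a stage that resolves n effective bits with half a bit of redundancy, an ideal feedback factor equal to the inverse of the stage gain (parasitic capacitances ignored), and a comparator offset tolerance expressed as a fraction of the reference voltage; the function name is hypothetical.

```python
def stage_properties(n_eff_bits):
    """Ideal properties of an (n + 0.5)-bit pipeline stage."""
    gain = 2 ** n_eff_bits
    return {
        "gain": gain,
        "feedback_factor": 1 / gain,                  # ideal value
        "comparators": 2 ** (n_eff_bits + 1) - 2,
        "offset_tolerance": 1 / 2 ** (n_eff_bits + 1),  # fraction of Vref
    }


for n in (1, 2, 3):
    print(f"{n}.5-bit stage:", stage_properties(n))
```

Each additional bit halves both the feedback factor and the tolerable comparator offset while roughly doubling the comparator count (2, 6, 14), which is why two 2.5-bit stages are preferred here over a single higher-resolution front-end stage.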

The specifications for the opamps used in the 2.5-bit and 1.5-bit MDACs are 
quite different. The 2.5-bit stages have a small feedback factor, which leads 
to a large open loop GBW requirement for the opamp. On the other hand, 
the opamps do not necessarily need to be stable in the unity gain feedback, 
since they are not auto-zeroed, which relaxes the phase margin specifications 
and thus makes it easier to fulfill the GBW requirements. Also, the DC gain 
has to be larger in the front-end stages, while the accuracy requirements scale 
down toward the LSB stages. The 1.5-bit stages have a larger feedback 
factor and a smaller DC gain requirement, but, simultaneously, higher stability 
specifications. The opamp topology used in the pipeline stages and shown in 
Figure 12.30 is similar to the one used in the front-end S/H circuit. The first 
two stages have similar opamps, which differ from the opamp used in the S/H 
circuit only in having a smaller compensation capacitor. The 1.5-bit stages all 
have identical opamps, where the bias current and device sizes are scaled down 
by a factor of four from the opamps of the first two stages. In the first pipeline 
stage 2.5-pF unit capacitors are used, resulting in a total sampling capacitance 
of 10 pF. In the second 2.5-bit stage and in the back-end pipeline stages the unit 
capacitors are scaled down to 1.0 pF. 

The sub-ADCs of the 2.5-bit, 2-bit, and 1.5-bit pipeline stages are of the 
flash type and consist of six, three, or two comparators, respectively. The 
comparators drive the switches of the MDAC and a small decoding logic, which 
converts the thermometer code into binary output. In the calibrated stages 
a multiplexer is added to select between the normal MDAC switch control 
signals and the state machine-produced calibration signals. The low stage 
resolution, together with the RSD correction, allows the use of 
non-DC-power-consuming dynamic comparators. The sub-ADCs utilize the same differential 
pair dynamic comparator (Figure 12.20) as the parallel pipeline ADC described 
in the previous section. 

Figure 12.33. The MDAC employed in calibrated 2.5-bit stages. 

5.8 Calibration Circuitry 

To meet the 13-bit resolution requirement with a high yield the effects of 
capacitor mismatch, low opamp DC gain, and reference voltage mismatch have 
to be compensated for. A widely-used method for 1-bit/stage pipeline ADCs 
is a digital self-calibration technique, where the discontinuities in the trans- 
fer function of each pipeline stage are serially measured using the back-end 
pipeline, giving correction coefficients to be added to the ADC output during 
normal operation. In the calibration method employed, developed from [80], 
the error attached to each reference unit capacitor is measured separately with 
the back-end stages and, following the switching scheme of the stage, correc- 
tion coefficients for the stage output codes and the offset can be cumulatively 
calculated from these measurement results. 

A simplified schematic of the 2.5-bit MDAC is shown in Figure 12.33. It 
differs from a conventional capacitive MDAC in that it has an extra calibration 
capacitor, the size of which is half of the unit capacitance. The stage under 
calibration samples the signal ground voltage, and in hold mode the capacitor 
being measured is connected to the positive or negative reference voltage, while 
the extra calibration capacitor is connected to the opposite reference voltage and 
the other capacitors to the ground. The resulting deviation of the output from the 
ideal ±Vref/2 is measured with the back-end pipeline. Similarly, the error of 
the calibration capacitor is measured and taken into account in the calculations. 

Without this extra capacitor, the stage output during the measurements would 
be ±Vref or 0, depending on whether one capacitor or a difference between 
two capacitors is measured. Measuring the differential capacitor errors 
(compared to the feedback capacitor) does not take account of the error resulting 
from the finite opamp gain, while performing the measurement in the vicinity 
of ±Vref easily saturates the back-end stages in the presence of even small 
mismatches. The extra capacitor eliminates both of these problems by shifting the 
measurement of the capacitor values to the same voltage range, the vicinity 
of ±Vref/2, where the transfer function steps are in normal operation. 

Figure 12.34. A 1.5-bit MDAC with the option of changing the gain from 2 to 4. 

As the output of the second stage during the calibration falls in a narrow 
voltage range near 0 or ±Vref/2, the first stages of the back-end pipeline 
are actually used as amplifiers, which makes possible the enhancement of the 
accuracy of the measurement by a factor of four, simply by doubling the inter- 
stage gain of the first two 1.5-bit stages. This is accomplished by halving the 
MDAC feedback capacitor during the calibration, as shown in Figure 12.34. 
Normally, the two halves are tied together, but during calibration one half is 
connected in parallel with the reference capacitor. Two extra stages are added 
to the back-end to get even more resolution when measuring the calibration 
coefficients, as well as to reduce the truncation error in the calibration adders 
during normal operation. The calibrated output is fixed to 13 bits. 

The calibration algorithm is implemented in VHDL. The design allows the 
inclusion of the calibration logic on the same chip as the ADC or its realization 
with an FPGA, which is combined with the ADC chip on the circuit board 
level. In this prototype the latter approach is used because of its flexibility. 
The calibration state machine controls the reference voltage switching in the 
MDACs of the first two stages during the calibration hold phases. In addition, 
turning off the S/H circuit and switching to the gain-of-four mode in the third 
and fourth stages are also controlled by the state machine. 

The input signals for the calculation and coding logic are the RSD-corrected 
digital words from the converter chip, the raw bits of the first and the second 
pipeline stage, the state-indicating bits, and an external reset signal. The seven 
calibration coefficients per stage are calculated as an average of four 
measurements and stored in a memory. During normal operation calibration coding is 
simply the addition of two coefficients, which are selected according to the raw 
bits of the two first stages, to each uncalibrated ADC output word. 

Figure 12.35. Die micrograph of the S/H prototype. 
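
The calibration coding described above can be sketched as follows. The coefficient values are invented for illustration, the raw stage outputs are assumed to be encoded as integers 0 to 6, and the final two-bit truncation from the 15-bit internal word to the 13-bit output is an assumption based on the word lengths given in this section; the actual VHDL implementation is not reproduced here.

```python
def average_coefficient(measurements):
    """Average four back-end measurements with a 2-bit right shift."""
    assert len(measurements) == 4
    return sum(measurements) >> 2


def calibrate(word15, raw1, raw2, coef1, coef2):
    """Add one coefficient per calibrated stage, then truncate to 13 bits."""
    return (word15 + coef1[raw1] + coef2[raw2]) >> 2


# Hypothetical coefficient tables, seven entries per calibrated stage.
coef1 = [-6, -4, -1, 0, 2, 3, 5]
coef2 = [-2, -1, 0, 0, 1, 1, 2]
print(average_coefficient([5, 6, 5, 4]))
print(calibrate(16000, raw1=6, raw2=6, coef1=coef1, coef2=coef2))
```

Carrying the two extra bits through the addition keeps the truncation error of the coefficients below the final 13-bit LSB.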

5.9 Measurements of the Front-End 

The front-end S/H circuit was first processed as a stand-alone prototype to 
allow its characterization without the ADC and the development of the measure- 
ment setup while the chip with the ADC was still in the fabrication process. The 
chip was fabricated in a 0.35-µm BiCMOS technology with silicon-germanium 
npn transistors and metal-insulator-metal capacitors. Figure 12.35 shows the 
die micrograph. 

5.9.1 On-Chip Circuitry 

The on-chip circuitry for the beat frequency test is shown in Figure 12.36. 
The second S/H circuit is followed by a track-and-hold circuit, since the output 
of the S/H is valid for only one half of the clock cycle and is being reset to 
zero during the other half. The T/H circuit tracks the valid S/H output and 
holds it over the reset phase. The circuit is simply a switch and a capacitor 
followed by a linearized source follower buffer similar (except that the resistor 
is replaced with a current source) to the one employed in the S/H circuit shown 
in Figure 5.5. 

Figure 12.36. On-chip test setup. 

Figure 12.37. Input buffer for low frequency measurements. 



The clock is brought in as a differential signal, which is first amplified and 
converted to a single-ended form. The resulting signal is used directly to control 
the sampling switches of the first S/H (the device under test) and as an input 
for the clock generator of the first circuit. The clock for the second circuit is 
generated from the incoming clock signal by dividing it with a programmable 
synchronous counter (N ∈ [2, 17]). The phase of the divided clock signal is 
locked properly to the sampling clock with the aid of the DLL circuit. 

5.9.2 PCB 

The test board is a 2-layer PCB in which the bottom layer is used as a 
ground plane. Four versions of the test board were made, one with a socket for 
the chip, and three where the prototypes were directly soldered to the board. 
These boards have different types of input circuitry, one being targeted at 
low-frequency measurements and the other two at measurements with an IF input 
signal, the difference being in the clock input; one uses an external clock source 
while the other has a crystal oscillator on the board. 

Figure 12.38. Input structure in IF measurements. 



The low-frequency input structure is shown in Figure 12.37. There, a fully 
differential discrete opamp is used for buffering the signal coming from the 
signal source. The purpose of resistors R1 is to attenuate the glitches produced 
by the switched-capacitor load. Initially, attempts were made to have the buffer 
circuit on a separate board, but the arrangement did not work well, probably as 
a result of too-high interconnect parasitics. 

At the 190-MHz IF frequency the opamp buffer solution is not applicable, 
because opamps with adequate bandwidth and linearity are not available. The 
current spikes resulting from loading the switched sampling capacitor have 
their energy at the clock frequency and its multiples. Thus, providing a low 
impedance path to ground at these frequencies facilitates the job of the signal 
source. This is realized by putting an LC resonator in parallel with the circuit 
input and tuning it to the IF frequency. The drawback of this solution is that it 
is an inherently narrow band, providing only a bandwidth of a few megahertz 
around the resonant frequency (the bandwidth, of course, depends on the Q 
value). Since the load current spikes have a large common mode component, 
the resonator cannot be inserted between the complementary input signals; 
instead, both the inputs require a separate resonator against the signal ground. 

The input circuit, shown in Figure 12.38, also includes matching elements
(C1, L1, and R1), a DC-block, and common mode voltage adjustment (R3,
which is bypassed with C2 at signal frequencies).

5.9.3 Equipment 

The input signal from the signal generator is first filtered with discrete LC 
filters in order to get rid of the harmonics. In the IF measurements a 190-MHz 
band-pass SAW filter was also tried. After filtering, the differential signals are 
generated with a power splitter. 

The differential beat frequency output is converted to a single-ended form
with a traditional instrumentation amplifier circuit constructed from three
low-distortion opamps. The instrumentation amplifier has a 50-Ω driving
capability. To achieve low distortion the gain of the amplifier is only 0.25
(to the 50-Ω load).

The clock input is driven directly from a sinusoidal signal source. For lower 
phase noise and jitter it is advantageous to take the clock from a crystal oscillator, 
which was also tested at the 50-MHz clock frequency. 

5.9.4 Functionality

The S/H circuit and the on-chip measurement setup work in a functionally
correct way with the nominal bias values. The DLL stays locked from ~25 MHz
to ~120 MHz and the lock range can be somewhat shifted by changing the DLL
bias current. The on-chip measurement setup works well up to a 60-MHz clock
frequency, after which the observed results suggest that there are some timing
problems at the interface of the two cascaded S/H circuits. The exact frequency
where the problems begin depends on the bias settings and the individual
sample.

5.9.5 Problems and Difficulties 

As described earlier, there were two different approaches to driving in the 
signal, opamp buffer at low frequencies and resonator at IF. The opamp buffer 
works well at low frequencies (10 MHz and below), but at 50 MHz it clearly 
limits signal purity. In between 10 and 50 MHz it is difficult to say whether the 
performance is limited by the chip or the buffer. 

In the IF, the resonator ideally (according to simulations) almost completely 
isolates the current spikes from the signal source. In reality, however, it seemed 
to be very difficult to construct a resonator with a high quality factor, which 
was probably due to PCB parasitics not included in the simulations. The low-Q
resonator attenuates the current spikes, but probably not enough. Thus, the
results obtained at 190-MHz IF are likely to be partially limited by the purity 
of the input signal rather than the circuit itself. This assumption is supported 
by the fact that the bias adjustments did not have a strong effect on the results. 

The measured transfer function is shown in Figure 12.39. The resonance 
peak is not as sharp as predicted by simulations and neither was its location 
correct with the nominal component values (L = 3.1 nH, C = 220 pF). By reducing
the resonator capacitance to 161 pF the resonant frequency was shifted to the 
desired 190 MHz. 
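
The retuning described above can be checked with the ideal parallel-LC
resonance formula f0 = 1/(2π·sqrt(LC)). The following is a minimal sketch, not
the procedure used in the measurements. It assumes, purely for illustration,
that the whole frequency shift is caused by extra series inductance from the
PCB, and the quality factor of 50 is a hypothetical value chosen only to show
the order of magnitude of the bandwidth.

```python
import math

def resonant_frequency(l_henry, c_farad):
    """Resonant frequency of an ideal parallel LC tank, in hertz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(l_henry * c_farad))

def effective_inductance(f_hz, c_farad):
    """Inductance that resonates with c_farad at f_hz."""
    return 1.0 / ((2.0 * math.pi * f_hz) ** 2 * c_farad)

def bandwidth_3db(f0_hz, q):
    """-3 dB bandwidth of a resonator with quality factor q."""
    return f0_hz / q

# Nominal component values quoted in the text.
f_nominal = resonant_frequency(3.1e-9, 220e-12)

# Assumption: after retuning to 161 pF the tank resonates at exactly 190 MHz,
# and the discrepancy is attributed entirely to additional inductance.
l_eff = effective_inductance(190e6, 161e-12)

print(f"ideal resonance with nominal values: {f_nominal / 1e6:.1f} MHz")
print(f"effective inductance after retuning: {l_eff * 1e9:.2f} nH")
print(f"bandwidth for a hypothetical Q of 50: {bandwidth_3db(190e6, 50) / 1e6:.1f} MHz")
```

Under these assumptions the nominal values resonate slightly above 190 MHz in
the ideal case, and the retuned capacitance implies an effective inductance of
roughly 4.4 nH, that is, more than 1 nH beyond the nominal 3.1 nH.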

In the IF measurements the level of the second harmonic was found to be 
rather high and any of the chip’s voltage or current bias settings has virtually 
no effect on it. The only thing that seemed to have an effect, which indeed 
was considerable, was the amplitude of the clock signal. The most probable 
explanation is that there is some coupling from the input signal to the clock. 
As shown in Chapter 8 the coupling results in a spurious signal at twice the 
signal frequency. By changing the amplitude of the sinusoidal clock the slope
changes and the observed effects seem to be in line with the theory. The effect
of signal frequency could not be tested because of the narrow bandwidth of the
input circuitry.

Figure 12.39. Measured resonator transfer function.

It was never found out where the coupling actually happened. Adding decou- 
pling capacitors in various places in the PCB and improving the return current 
paths around the signal and the clock traces in the PCB did not have a signif- 
icant effect. So the coupling probably does not happen on the PCB. It may 
occur between the bond wires or on the chip. One possible on-chip coupling 
mechanism is through modulation of the supply voltage. 

5.9.6 Low Frequency Results 

At low frequencies the circuit showed such low distortion that when the signal 
amplitude was ~2.5 dB below the full scale or less the harmonics disappeared 
below the spectrum analyzer noise floor, which was then at the 75-78 dBc level. 

5.9.7 IF Results 

The levels of the second and third harmonics are plotted as functions of
the output signal amplitude at the 50 and 60-MHz clock frequencies in
Figures 12.40-12.43. As mentioned earlier, the level of the second harmonic
depends almost solely on the amplitude of the clock signal. From the 50-MHz
results it can be seen that, when taking into account the 6 dB difference
between the two gain modes, the second harmonic for a given input amplitude is
the same regardless of the S/H circuit gain. The level of the harmonic decreases
6 dB per decade faster than the input amplitude, which supports the theory of
clock contamination.

Figure 12.40. Distortion at the 50-MHz clock frequency. The input frequency is 190.2 MHz
and the full-scale amplitude corresponds to +3.8 dBm.

It was clearly seen that the phase noise with the crystal clock was significantly 
lower than with a signal generator clock (reference locked to the input signal 
generator). The level of the second harmonic, on which only the clock amplitude 
had some effect, was higher than in the measurements with the signal generator 
as the clock. This is because the clock amplitude could not be adjusted to 
minimize the distortion and probably because the clock signal was single-ended 
unlike the one obtained with the generator. 

Some form of filtering is always needed to remove signal source harmonics. 
All the IF results presented were obtained with a 200-MHz lowpass LC filter. A 
bandpass SAW filter (single-ended) and combinations of the SAW and the LC 
were also tested, but the results were practically unchanged. The problem with 
the SAW filter is its high attenuation (~9 dB), which prevented measurements 
with signal amplitudes close to the full scale, because the signal generator 
amplitude was already almost at its maximum without the SAW device. 
Figure 12.41. Distortion at the 50-MHz clock frequency taken from an on-board crystal
oscillator. The input frequency is 190.1 MHz and the full-scale amplitude corresponds to +3.6 dBm.

Figure 12.42. Distortion at the 50-MHz clock frequency taken from an on-board crystal
oscillator. The input frequency is 140.1 MHz and the full-scale amplitude corresponds to +3.6 dBm.

Figure 12.43. Distortion at the 60-MHz clock frequency. The input frequency is 190.2 MHz
and the full-scale amplitude corresponds to +3.8 dBm.



5.10 ADC Experimental Results 

The circuit containing the whole system shown in Figure 12.28 was fabricated 
with the same 0.35-µm BiCMOS technology as the stand-alone front-end, and
the chip was packaged in a 52-pin VFQFPN package. The benefits of this almost 
chip-scale package are small parasitics and also a low ground inductance and 
good thermal conductivity, which are thanks to the die attachment to a large 
pad exposed through the package. 

The prototype was measured using a 4-layer PCB, the FPGA being on a 
separate board. A common ground plane was used for the analog and the 
digital circuitry. The clock signal from an on-board 50-MHz crystal oscillator 
was connected to one of the differential clock inputs, while the complementary 
input was grounded near the oscillator. The arrangement for bringing in the IF 
signal was identical to the one used with the stand-alone S/H prototype. 

Figure 12.44. 13-bit 50-MS/s pipeline ADC die micrograph.

The first measurements revealed that there was a timing error in the digital
delay line used for aligning the output bits of the pipeline stages. As a result the
output contained a large number of seemingly random bit errors. It was found
that these errors could be minimized, but not totally eliminated, by lowering
the supply voltage from the designed 3.0 V to 2.9 V and performing the testing
at a temperature of +5°C. Due to the remaining bit errors the noise floor was
high, limiting the SNDR to 55-60 dB. On the other hand, the linearity did not
seem to suffer significantly.
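
To put the SNDR range above into perspective, it can be converted to an
effective number of bits with the standard relation ENOB = (SNDR - 1.76)/6.02,
which holds for a full-scale sine wave. The sketch below only illustrates that
textbook formula; the function name is ours.

```python
def enob(sndr_db):
    """Effective number of bits for a full-scale sine-wave SNDR given in dB."""
    return (sndr_db - 1.76) / 6.02

for sndr in (55.0, 60.0):
    print(f"SNDR {sndr:.0f} dB -> {enob(sndr):.2f} effective bits")
```

An SNDR of 55-60 dB thus corresponds to roughly 8.8-9.7 effective bits, well
below the nominal 13-bit resolution, which is consistent with the noise floor
being set by the bit errors rather than by the linearity.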

The static linearity was measured with a 195.2-MHz signal and calculated 
using the code density test. Figure 12.45 shows the results before and
after the calibration cycle. The calibration improves the maximum INL from 
±7.1 LSB to ±3.0 LSB, the DNL being within ±1.0 LSB in both cases. 
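
The code density (histogram) test mentioned above can be sketched as follows.
For a sine-wave input the transition levels are estimated from the cumulative
histogram as T_k = -cos(π·CH_k/N), and DNL and INL follow from the spacing of
these levels. This is a generic illustration run on a simulated ideal 6-bit
quantizer, not the measurement script used for the prototype; the quantizer,
the 2% overdrive, and the golden-ratio phase sequence are all assumptions made
to keep the example self-contained.

```python
import math

def ideal_adc(v, bits):
    """Ideal quantizer with a -1..+1 input range; clips outside it."""
    levels = 1 << bits
    code = int(math.floor((v + 1.0) * levels / 2.0))
    return min(max(code, 0), levels - 1)

def code_density(codes, bits):
    """Return (dnl, inl) in LSB for codes 1..2^bits-2 from a sine histogram."""
    levels = 1 << bits
    hist = [0] * levels
    for c in codes:
        hist[c] += 1
    total = len(codes)
    # Transition levels (normalized to the sine amplitude) from the
    # cumulative histogram, using the arcsine distribution of a sine wave.
    transitions = []
    cumulative = 0
    for k in range(levels - 1):
        cumulative += hist[k]
        transitions.append(-math.cos(math.pi * cumulative / total))
    widths = [b - a for a, b in zip(transitions, transitions[1:])]
    lsb = sum(widths) / len(widths)          # average code width = 1 LSB
    dnl = [w / lsb - 1.0 for w in widths]
    inl, acc = [], 0.0
    for d in dnl:                            # INL is the running sum of DNL
        acc += d
        inl.append(acc)
    return dnl, inl

BITS = 6
N_SAMPLES = 100_000
GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0        # spreads the phases evenly
codes = [
    ideal_adc(1.02 * math.sin(2.0 * math.pi * ((i * GOLDEN) % 1.0)), BITS)
    for i in range(N_SAMPLES)
]
dnl, inl = code_density(codes, BITS)
print(f"max |DNL| = {max(abs(d) for d in dnl):.3f} LSB")
print(f"max |INL| = {max(abs(x) for x in inl):.3f} LSB")
```

For the ideal quantizer both figures come out close to zero; with measured
codes the same routine would expose the converter's actual nonlinearity.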

The SFDR measured with a -1-dBFS (3.4 Vpp differential) signal was better
than 73 dB at all frequencies in the range from 190 MHz to 200 MHz. An
example spectrum with a 76.5-dB SFDR is shown in Figure 12.47. The effect
of calibration can be seen by comparing it with Figure 12.46, where the same
measurement is repeated without the calibration. A two-tone test, performed
with the S/H circuit in the gain-of-two mode, is presented in Figure 12.48. The
measured power consumption from the 2.9-V supply was 715 mW. Table 12.3
summarizes the ADC performance.
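
Because the IF input lies far above the 25-MHz Nyquist frequency of the
50-MS/s converter, the signal and its harmonics appear in the output spectrum
at folded (aliased) frequencies. The short sketch below shows where they land
for the 194.2-MHz test tone; it is plain sampling arithmetic and the function
names are ours.

```python
def alias_frequency(f_in, f_s):
    """Frequency (same unit as the inputs) at which f_in appears after sampling at f_s."""
    f = f_in % f_s
    return min(f, f_s - f)

def nyquist_zone(f_in, f_s):
    """1-based index of the Nyquist zone that contains f_in."""
    return int(f_in // (f_s / 2.0)) + 1

F_S = 50.0      # MHz
F_IN = 194.2    # MHz
for harmonic in (1, 2, 3):
    folded = alias_frequency(harmonic * F_IN, F_S)
    print(f"harmonic {harmonic}: {harmonic * F_IN:.1f} MHz -> {folded:.1f} MHz")
print("input tone lies in Nyquist zone", nyquist_zone(F_IN, F_S))
```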

The measurements showed a greatly improved second harmonic at IF com- 
pared to the stand-alone version of the S/H circuit. The possible reasons for 
the reduced clock contamination are the larger separation of the input and the 
clock pads, smaller bond wire and package inductances thanks to the VFQFPN 
package, and reduced coupling on the board level resulting from the 4-layer PCB.
Figure 12.46. A spectrum measured with a 194.2-MHz, -1-dBFS signal before calibration.

Figure 12.47. A spectrum measured with a 194.2-MHz, -1-dBFS signal after calibration shows
a 76.5-dB SFDR.

Figure 12.48. A two-tone spectrum obtained with -7-dBFS signals centered at 190 MHz. The
S/H circuit is in the gain-of-two mode.

Table 12.3. Summarized ADC performance.

Resolution                    13 bits
Sample Rate                   50 MS/s
Input Range (differential)    3.8 V
Input Bandwidth               > 200 MHz
DNL / INL (calibrated)        ±1.0 / ±3.0 LSB
SFDR (@ 200 MHz)              76.5 dB
Supply Voltage                2.9 V
Power Dissipation             715 mW
Die Area                      6.0 mm²
Technology                    0.35-µm BiCMOS (SiGe)

6. Deglitcher for Current Steering DACs

6.1 Introduction to Current Steering DACs

Traditionally, high-speed DACs have been used in video and computer graphics
applications, but recently the migration to wideband wired and wireless
telecommunication standards and the evolution of radio transmitter
architectures toward the software-defined radio have created a need for
high-speed, high-resolution telecommunication DACs.

In the past the research and development of DACs have been heavily
concentrated on improving the static (DNL, INL) and, to some extent, the time
domain specifications (settling time, glitch area), almost totally neglecting
spectral purity and other frequency domain characteristics that are essential
in telecommunication devices.

Practically all high-speed DACs are based on the current steering architec- 
ture, one of the main reasons for this popularity being its capability of driving 
resistive loads without buffering. A 14-bit static linearity has been achieved by 
using trimming, self-calibration [227], and even intrinsically [228]. A typical 
problem in these DACs is the rapid increase in harmonic distortion when the 
signal frequency is increased. This is mainly due to the glitches occurring at 
the code changes. The glitches are results of incoherent timing of the current 
switches, non-optimal shape of the switch control waveforms, and coupling of 
the digital signals to the analog output. 

Attempts to reduce glitches include the use of latches to synchronize the 
switch controls, circuits to generate optimal control waveforms for the switches, 
and the use of return-to-zero-type output to suppress the output during the code
changes [229]. The return-to-zero technique utilized in [230] yields a clear
improvement in high-frequency SFDR compared to earlier reported DACs, but 
still has some limitations, such as the difficulty of providing large amplitudes 
to a low-resistance load, the complicated circuitry needed to handle signal 
dependent parasitics, and sensitivity to clock jitter, which is not relaxed, unlike 
in conventional DACs, when signal frequency is decreased. To alleviate the 
first two of these problems the same authors have proposed the track/attenuate 
technique [227], which is basically a switch put in parallel with the load to short 
the output during the DAC switching. 

To avoid the jitter problem and signal attenuation it is possible to use a track- 
and-hold circuit as a deglitcher; the DAC is cascaded with a T/H circuit which 
tracks the DAC output when it is in steady state and holds a sampled voltage 
during DAC settling. Although the deglitcher does a good job of removing 
code-dependent glitches it typically cannot achieve as high a speed as a current- 
steering DAC and, furthermore, the voltage output provided by the T/H needs 
to be buffered in order to drive resistive loads. 

The DAC presented here and published in [14] and [15] employs a deglitcher 
which is based on current mode circuitry. The output is provided in the form 
of a current which eliminates the need for a buffer. A high speed is achieved 
by employing parallelism. The circuit does not rely on matching, which makes 
it very robust and eliminates the need for calibration. 

6.2 Circuit Description 

6.2.1 Architecture

The core DAC, shown in Figure 12.49, is based on segmented architecture.
The current sources are divided into two unit current source arrays, with 8 LSBs
in one and 6 MSBs in the other. In the LSB array the current sources are
binary-weighted and constructed of parallel unit sources distributed around the
matrix to compensate for linear and center-symmetric process variations. In the
MSB matrix the two least significant bits are binary-weighted and the remaining
four are formed with 15 unweighted sources, each constructed of four unit
sources laid out in common centroid geometry. To reduce cumulative mismatch
errors, the consecutive unweighted sources are selected in such a manner that
if one source is constructed of transistors on the periphery of the matrix, the
next one will have its transistors closer to the center, and vice versa.

Figure 12.49. Block diagram of the DAC.

The current bias for the LSB and MSB arrays is generated in a bias array 
from a single external reference current. The bias array is a current mirror that 
generates two currents, the LSB bias being 1/64 and the MSB bias 4 times the 
reference. The current ratio can be manually trimmed with an external 5-bit 
control signal. 

The current switches are controlled with a 10-bit binary code and a 16- 
level thermometer code, which are synchronized with a latch stage before the 
switches. 
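
The segmentation described above can be illustrated with a small behavioral
sketch: a 14-bit input word is split into 10 binary-weighted bits and a
thermometer code driving the 15 unweighted sources, and the weighted sum of the
switched currents reproduces the input code. The sketch assumes that the unit
transistors of the two arrays are identical, so that the 256:1 bias ratio
(4 versus 1/64 of the reference) makes one MSB-array unit equal to 256 LSB
units. The function names are ours and the model ignores all mismatch.

```python
def segment_code(code):
    """Split a 14-bit code into 10 binary bits and a 15-line thermometer code."""
    if not 0 <= code < (1 << 14):
        raise ValueError("code must fit in 14 bits")
    binary_bits = code & 0x3FF           # 8 LSB-array bits + 2 binary MSB-array bits
    upper = code >> 10                   # top 4 bits select 0..15 unweighted sources
    thermometer = [1 if i < upper else 0 for i in range(15)]
    return binary_bits, thermometer

def output_in_lsb_units(binary_bits, thermometer):
    """Sum of the switched currents, expressed in LSB unit currents."""
    msb_unit = 256                       # assumed: bias ratio 4 / (1/64)
    lsb_array = binary_bits & 0xFF       # binary-weighted LSB array
    msb_binary = binary_bits >> 8        # two binary bits: 1 and 2 MSB units
    unweighted = 4 * sum(thermometer)    # each unweighted source = 4 MSB units
    return lsb_array + msb_unit * (msb_binary + unweighted)

binary_bits, thermometer = segment_code(0x2ABC)
print(bin(binary_bits), thermometer, output_in_lsb_units(binary_bits, thermometer))
```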

The DAC output is connected to the deglitcher, which requires a 1.75-V
voltage headroom; thus, the DAC has to fit within 1.25 V when a 3.0-V supply
voltage is used. This has an effect on the sizing of the current sources and the
DAC switches; the current sources are biased to a 1.2-V gate-source voltage
to minimize the effect of threshold voltage variation in the available voltage
headroom. The single-ended full-scale output current of the DAC is 10 mA.

The clock signal is brought into the chip in differential form to improve the 
noise rejection on the board and package level. 

6.2.2 Deglitcher 

The principle of the deglitcher is shown in Figure 12.50. It consists of four 
single-ended current mode sample-and-hold circuits and four switch pairs that 
operate in time-interleaved fashion. During one clock cycle the DAC differential 
output current is sampled into two current memories and the other two supply 
to the output the current which has been sampled in the previous clock cycle. 
This way the DAC is never directly connected to the output and the sampling 
phase is extended to cover nearly the whole clock cycle, maximizing the speed. 

In the sampling phase the current memory is enclosed in a feedback loop that 
forces its current to be equal to the DAC output current. At the end of the phase 
the feedback loop is opened and the current gets sampled in the memory. In 
the hold phase the current memory acts as a current source, whose value is the 
one stored in the memory. In this phase the circuit is connected to the external 
load. Since the current value is set by feedback and kept unchanged when the 
circuit is connected to the load, no accurate matching is required between the 
current memories. 

The clock waveforms for the circuit are shown in Figure 12.51. The DAC
is clocked at the full rate, while the deglitcher uses a set of half-rate clocks.
The current switches are controlled with the complementary signals Clk+ and
Clk-, which have a 50% duty cycle. Signals Clk1 and Clk2 are non-overlapping
clocks for the current memories. Two currents I3 and I4 are also shown to
clarify the operation.
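
The ping-pong operation of the deglitcher can be summarized with a purely
behavioral sketch. It treats each differential memory pair as an ideal storage
element, so settling, charge injection, and the switch timing are all ignored;
the function name is ours. In every clock cycle one pair tracks the DAC while
the other drives the load with the value captured one cycle earlier.

```python
def deglitch(dac_samples):
    """Return the load current per clock cycle for an ideal two-pair deglitcher."""
    memories = [None, None]              # two memory pairs, initially empty
    output = []
    for n, i_dac in enumerate(dac_samples):
        sampling = n % 2                 # pair enclosed in the feedback loop
        holding = 1 - sampling           # pair connected to the load
        output.append(memories[holding])
        memories[sampling] = i_dac       # loop opens at the end of the cycle
    return output

print(deglitch([1.0, 2.0, 3.0, 4.0]))
```

The output is the DAC sequence delayed by one clock cycle, and the DAC itself
is never connected to the load.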

A more detailed implementation of the deglitcher is shown in Figure 12.52. 
There, the current switches are implemented with bipolar transistors and a 
cascode transistor is inserted between the current switch and the circuit output. 
In addition, a cascode current source is added in parallel with the DAC to bias 
the current memory. The base currents of the switch transistors — as long as the 
transistors in the switch pair are matched — are not a problem, since the switches 
are enclosed in the feedback loop when the current is sampled. 

6.2.3 Current Switches 

The bipolar current switches are driven with the circuit shown in Figure 12.53.
It uses two voltages Vu and Vl, to which the bipolar transistor base is connected
in the on- and off-phases, respectively. The switches S1 through S4 are
bootstrapped MOS switches similar to the one in Figure 6.18. These switches are
controlled with overlapping signals d1 and d2, which, together with their
complements, are shown in Figure 12.53. Also shown are the switch transistor base
voltages, which cross near the high voltage level to avoid turning off both the
transistors simultaneously, which would disturb their common emitter node and
produce a glitch in the output. The crossing point depends on the overlap time
and is adjusted to be too high rather than too low so as to guarantee desired
operation under all process conditions.

Figure 12.51. Clock signals.

6.2.4 Current Memory 

Typically, the current memories are based on a MOS transistor whose gate
capacitance is used as an internal storage element, while the memory input
and output are both the drain current [231]. Such a basic circuit is shown in
Figure 12.54 [232]. There, the transistor M1 acts as a voltage-controlled current
source, the current of which is set equal to I_in during the Clk high phase. This
is accomplished by connecting the transistor in a diode configuration through
the MOS transistor switch M2. As a result M1's gate voltage V_G settles to
a value which makes the drain current equal to I_in. When Clk goes low,
M2 turns off and the voltage V_G is sampled in the capacitor C1, which can be
the gate capacitance of M1 or a combination of the gate capacitance and an
additional capacitor. Now M1 acts as a current source equal in value to I_in at
the sampling instant.

Figure 12.53. Current switch and its timing.

As regards high resolution applications, the most severe limitation of this
circuit is the harmonic distortion originating from the sampling switch charge
injection [233]. The nonlinear relationship between the gate voltage and the
drain current results in a situation where even a constant charge injection from
M2 produces harmonic distortion in the output current. Moreover, the switch
M2 operates against the voltage V_G, which makes the charge injection
signal-dependent. Another problem is the limited output impedance of M1, which
also distorts the output current if the output voltage during the hold phase does
not match the voltage in the sampling phase.

Figure 12.54. Basic current memory [232].



The output impedance can be substantially improved simply by cascoding 
M1 with another device. Ways of reducing the effect of the charge injection
range from using differential circuitry or dummy switches to the very accurate
S²I technique [234], where the sampling is done with two memory elements
in two phases; first, a coarse sample is taken in the larger memory and in the 
next phase a correcting fine sample is taken in the other one. As a result, 
the remaining error is proportional to the size of the small fine memory. The 
disadvantage of this technique is the increase in sampling time introduced by 
the added second sampling phase. 

In this design the approach taken to achieve 14-bit resolution is two-fold. 
First, the linearity of the current memory is maximized to make the circuit less 
sensitive to the constant charge injection, and second, the signal dependency 
of the charge injection is significantly reduced. Furthermore, when the current 
memory is linear, even an error linearly dependent on the signal can be tolerated, 
since it only affects the signal amplitude. 

In [235] the signal-dependent charge injection is avoided by adding an opamp 
in the feedback loop to create a virtual ground at the drain of M1. The sampling
switch is moved from M1's gate to the virtual ground, which makes its charge
injection independent of the signal. The extra element in the feedback loop, 
however, inevitably increases the settling time. For this reason the approach 
taken in this design is to improve the switch itself. 

The current memory is shown in Figure 12.55. The switch transistor
gate-source voltage is made virtually constant by using bootstrapping. Neglecting
the bulk effect, this makes the channel charge, as well as the injection resulting
from its release, constant. The actual switch realization is based on the circuit
presented in [134] and shown in Figure 6.18. The maximum gate overdrive
yields a low on-resistance with a small switch transistor, minimizing the
nonlinear parasitic capacitances, which are effectively in parallel with the
memory capacitor.

Figure 12.55. BiCMOS current memory.

Besides the channel charge, there is also charge redistribution in the gate
overlap capacitance C_GOL. Since, when entering the off-phase, the switch gate
is pulled to a constant voltage, the resultant signal-dependent error charge is
C_GOL · V_G. When the capacitances are constant (which is mostly true) and
the current memory is linear, the error results in only a small change in signal
amplitude. If necessary, this error could be avoided by switching the gate to a
voltage that is a properly buffered and level-shifted copy of V_G.

The high linearity of the current memory is based on the fact that the transistor
M1 is biased in the triode, not the saturation, region. There, the drain current
is given by

$$ I_D = \mu C_{ox} \frac{W}{L} \left[ (V_{GS} - V_T)\,V_{DS} - \frac{V_{DS}^2}{2} \right]. \qquad (12.1) $$

Now, if the drain-source voltage is kept constant, the circuit is perfectly linear.
To make the voltage on the drain as constant as possible M1 is cascoded with the
bipolar transistor Q1, which has an inherently large g_m, which is further boosted
by using regulation. Regulating a bipolar cascode transistor, in contrast to a
MOS transistor, does not improve the output impedance [236]. But, as already
said, that is not the main reason for the regulation here. To achieve a high
linearity it is necessary to bias the transistor M1 deep in the linear region. The
cascode current source, consisting of M2 and M3, is for biasing M1.
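
The benefit of triode biasing can be made concrete with a small numerical
sketch. A fixed gate-voltage error, such as the step left by a constant charge
injection, is applied to first-order triode and saturation models of the memory
transistor, and the resulting current error is compared over a range of stored
gate voltages. The device parameters (k = 1 mA/V², V_T = 0.6 V, V_DS = 0.2 V,
a 5-mV step) are illustrative assumptions, not values from the design.

```python
def triode_current(v_gs, v_ds, k=1e-3, v_t=0.6):
    """First-order triode-region drain current; k stands for mu*Cox*W/L."""
    return k * ((v_gs - v_t) * v_ds - v_ds ** 2 / 2.0)

def saturation_current(v_gs, k=1e-3, v_t=0.6):
    """First-order square-law drain current in saturation."""
    return 0.5 * k * (v_gs - v_t) ** 2

def error_spread(current_fn, gate_voltages, dv):
    """Variation of the current error caused by a fixed gate step dv."""
    errors = [current_fn(v + dv) - current_fn(v) for v in gate_voltages]
    return max(errors) - min(errors)

GATE_VOLTAGES = [0.9 + 0.1 * i for i in range(7)]    # 0.9 V ... 1.5 V
DV = 5e-3                                            # assumed injection step

triode_spread = error_spread(lambda v: triode_current(v, 0.2), GATE_VOLTAGES, DV)
saturation_spread = error_spread(saturation_current, GATE_VOLTAGES, DV)
print(f"triode:     error varies by {triode_spread:.3e} A over the signal range")
print(f"saturation: error varies by {saturation_spread:.3e} A over the signal range")
```

With a constant drain-source voltage the triode device turns the gate step into
a signal-independent offset, whereas in saturation the same step produces an
error that follows the signal and therefore gives rise to distortion.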

The sampling speed of the deglitcher is determined by the time constants
associated with the feedback loop. When the loop is broken at the gate of M1
(leaving the capacitor C1 on the output side) there is only one high impedance
node (the DAC output), which justifies the use of the single pole approximation,
giving the following gain-bandwidth product:

$$ GBW = \frac{g_{m1}}{C_{DAC}}, \qquad (12.2) $$

where g_m1 is the transconductance of M1 and C_DAC the DAC output
capacitance. Since the transconductance (together with the full-scale output
current) determines the voltage swing on the memory capacitor, and the capacitor
value determines the sensitivity to charge injection and noise, there is a
tradeoff between speed and accuracy. The sampling switch on-resistance and the
other nodes in the loop produce non-dominant poles that affect the phase margin.
These poles are given by



$$ p_2 = \frac{1}{R_{on}\,\dfrac{C_1 C_{DAC}}{C_1 + C_{DAC}}} $$

[Equations (12.3)-(12.5); only the expression for p_2 is recoverable.]



The auxiliary amplifier used in the regulated cascode transistor is shown in 
Figure 12.56. It consists of a transresistance input stage and an emitter follower 
buffer, which is needed to supply the large base current of the cascode BJT. 

The current memory is biased via the regulation amplifier. The bias circuit
is shown in Figure 12.56. It uses a scaled-down replica of the current memory
(M1 and I1), the input voltage (V_mem) of which is set at the desired level
(1.5 V) with R1 and R2. This arrangement sets the drain voltage and, more or
less, the transconductance of M1 correctly regardless of the process parameters
and temperature. The drain voltage is used to generate the correct gate bias V_BB
for the regulation amplifier.



6.2.5 Limitations 

A potential problem in all circuits using time interleaved parallelism is
mismatch between the parallel circuits. The current memory-based deglitcher,
however, does not rely on matching, thanks to the use of the current copying
principle. The only parallel element not enclosed in the feedback loop in the
sampling phase is the current switch. However, being constructed of bipolar
devices, its matching should be adequate. Although the component matching
is not a problem, the timing of the current switch can be. Any deviation from
the 50% duty cycle will produce a spectral image of the signal around the half
clock frequency. The error is signal frequency-dependent, getting worse as the
frequency is increased. In most applications some amount of over-sampling is
used, which moves the image outside the signal band and makes it tolerable up
to some limit.

Figure 12.56. Regulation amplifier (left) and the bias circuit (right).

To minimize the error the half-rate clocks for the current switch control cir- 
cuit are generated by dividing the full-rate clock with a carefully matched syn- 
chronous divide-by-two circuit constructed with a fully differential D-flipflop. 
For prototyping purposes a manual rise time control circuit was included in the 
signal path for fine-tuning the duty cycle if it turned out to be necessary. 
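
The image caused by a duty-cycle error can be illustrated numerically. In the
sketch below the output is modeled as an ideal staircase in which
even-numbered samples are held for (1 + delta) clock periods and odd-numbered
ones for (1 - delta) periods, and the Fourier components at the signal
frequency and at the image frequency (half the clock rate minus the signal
frequency) are evaluated analytically segment by segment. The model, the
200-sample record, and the parameter names are assumptions made for
illustration; the ratio 57/200 merely mimics an 11.4-MHz signal with a 40-MHz
clock.

```python
import cmath
import math

def tone_amplitude(samples, delta, freq_bin):
    """Amplitude of the component at freq_bin cycles per record (freq_bin > 0).

    The waveform holds samples[n] for (1 + delta) periods if n is even and
    for (1 - delta) periods if n is odd; the record length must be even.
    """
    n_total = len(samples)
    w = 2.0 * math.pi * freq_bin / n_total          # rad per clock period
    acc = 0j
    t_start = 0.0
    for n, x in enumerate(samples):
        t_end = t_start + (1.0 + delta if n % 2 == 0 else 1.0 - delta)
        # Exact integral of x * exp(-j*w*t) over the hold interval.
        acc += x * (cmath.exp(-1j * w * t_start) - cmath.exp(-1j * w * t_end)) / (1j * w)
        t_start = t_end
    return 2.0 * abs(acc) / n_total

def image_to_signal_ratio(delta, signal_bin, n_total=200):
    """Image amplitude at (n_total/2 - signal_bin) relative to the signal."""
    samples = [math.sin(2.0 * math.pi * signal_bin * n / n_total) for n in range(n_total)]
    signal = tone_amplitude(samples, delta, signal_bin)
    image = tone_amplitude(samples, delta, n_total // 2 - signal_bin)
    return image / signal

for delta in (0.005, 0.01, 0.02):
    ratio = image_to_signal_ratio(delta, 57)
    print(f"duty-cycle error {delta:.3f}: image at {20.0 * math.log10(ratio):.1f} dBc")
```

In this model the image vanishes for a perfect 50% duty cycle, grows in
proportion to the timing error, and is larger for higher signal frequencies,
in line with the behavior described in the text.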

6.3 Simulations and Experimental Results 

The circuit was designed using a 0.35-µm BiCMOS (SiGe) technology. The
chip, a photograph of which is shown in Figure 12.57, occupies a total of
5.7 mm² of silicon area. The majority of the area is consumed by the DAC
current sources, the deglitcher being only a small block on the lower right corner
of the chip. A minor layout error in the first processed version, unfortunately,
prevented any decent performance figures from being obtained.

A simulated spectrum with an 11.4-MHz input signal and a 40-MHz clock
frequency is shown in Figure 12.58. The highest spur, the third harmonic,
lies 86 dB below the signal and the level of the fifth harmonic is -92 dBc.
No even-order harmonics can be seen, thanks to the differential circuitry. The
power consumption from a 3.0-V supply is 370 mW and is dominated by the
deglitcher with its 70% share.

6.4 Conclusions 

The idea of using a current mode track-and-hold circuit as a deglitcher after a
current-steering DAC has been proposed and demonstrated with a prototype
circuit. The time-interleaved deglitcher uses the current copying principle, which
makes it insensitive to component mismatch. The current memory developed
achieves exceptionally high linearity thanks to a bootstrapped sampling switch
and a transconductor constructed of a triode region MOSFET cascoded with
a regulated bipolar transistor. According to the simulations, a 14-bit dynamic
accuracy is achieved at a 40 MS/s sampling rate. The design presented
demonstrates that a current mode track-and-hold circuit can be successfully used
for removing glitches from the output of a high-speed current-steering DAC and,
as a result, a significant improvement in dynamic performance can be expected.

Figure 12.59. Block diagram of the realized ADC.

7. 1st Switched Opamp Pipelined ADC 

7.1 Introduction 

It has been demonstrated that CMOS ADCs implemented in current mode 
techniques are capable of operating with supply voltages of 1.5 V and below
[237, 238]. Current mode circuits, however, seem to have limited linearity, 
especially at higher signal frequencies. The switched capacitor (SC) technique, 
which has an inherently good linearity, has been widely employed in pipelined 
ADCs operating on supply voltages above 2.5 V, but the insufficient switch 
overdrive prevents it being used for low-voltage applications in its standard 
form. The switched opamp technique, which is one of the low-voltage modifi- 
cations of the SC technique, has, for the first time, been applied to a pipelined 
ADC in the prototype described in this section and originally published in [8], 

7.2 ADC Architecture

The 9-bit ADC is realized with the standard 1.5 bits-per-stage pipeline
architecture, where the 0.5 bit redundancy in each stage is used for digital correction
to relax the requirement for comparator offsets. A block diagram of the ADC 
is shown in Figure 12.59. Each pipeline stage performs a coarse (in this case 
three-level) A/D conversion for its input signal and passes the amplified quan- 
tization error to the next stage. The quantization error (or residue) is formed 
by converting the quantization result back to analog form and subtracting it 
from the input signal. The residue formation and its precise amplification are 
performed by a multiplying digital-to-analog converter (MDAC). 
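
As a sketch of the stage operation described above, the residue of an ideal
1.5-bit stage can be written as follows. The decision thresholds of ±V_REF/4
and the symbols V_res and D are assumptions made here for illustration (the
thresholds are the conventional choice for this architecture); the gain of two
follows from the 1.5-bits-per-stage resolution.

```latex
\[
V_{res} = 2V_{in} - D\,V_{REF}, \qquad
D = \begin{cases}
+1, & V_{in} > +V_{REF}/4,\\
\phantom{+}0, & -V_{REF}/4 \le V_{in} \le +V_{REF}/4,\\
-1, & V_{in} < -V_{REF}/4.
\end{cases}
\]
```

At an ideal decision threshold the residue is ±V_REF/2, so a threshold error of
up to V_REF/4 still leaves the residue within the ±V_REF input range of the
next stage, which is what the digital correction relies on.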

The operation of the pipeline stage consists of two phases each lasting half a 
clock cycle. In the first phase the MDAC samples the input signal and the sub- 
ADC does the A/D conversion. During the second phase the MDAC generates 
and amplifies the residue, yielding the input signal for the next stage. The 
successive stages operate in opposite phases and thus the conversion of a sample 
traverses two stages in a clock cycle. 

There is a total of seven stages, like the one whose block diagram is shown 
in the inset of Figure 12.59. Since the last stage does not need to generate a
residue, it is implemented as a 2-bit flash ADC consisting of three comparators 
and a small number of logic gates. In the reported measurement results the 
originally 9-bit output is truncated to 8 bits. 
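
The interplay of the 1.5-bit stages, the final flash, and the digital
correction can be illustrated with a small behavioural model. This is a
minimal sketch, not the prototype's implementation: it assumes seven ideal
1.5-bit stages followed by the 2-bit flash (one reading of the description
that yields 9 bits), thresholds at ±V_REF/4, and an overlap-and-add correction
logic. All function and parameter names are hypothetical.

```python
"""Behavioural sketch of a 1.5-bit-per-stage pipeline with digital correction."""


def stage_1p5bit(vin, vref, offset=0.0):
    """One stage: three-level decision followed by ideal residue amplification.

    Returns (code, residue) with code in {0, 1, 2}. The nominal thresholds are
    +vref/4 and -vref/4; `offset` shifts them to +vref/4 + offset and
    -vref/4 - offset to model comparator errors.
    """
    if vin > vref / 4 + offset:
        d = 1
    elif vin < -vref / 4 - offset:
        d = -1
    else:
        d = 0
    residue = 2 * vin - d * vref  # subtract the DAC level, amplify by two
    return d + 1, residue


def flash_2bit(vin, vref):
    """Final 2-bit flash: three comparators at -vref/2, 0 and +vref/2."""
    return sum(vin > t for t in (-vref / 2, 0.0, vref / 2))


def convert(vin, vref=1.0, n_stages=7, offsets=None):
    """Convert vin in (-vref, vref) to an (n_stages + 2)-bit output code."""
    if offsets is None:
        offsets = [0.0] * n_stages
    code = 0
    for i in range(n_stages):
        c, vin = stage_1p5bit(vin, vref, offsets[i])
        code = 2 * code + c  # overlap-and-add: one-bit shift per stage
    return 2 * code + flash_2bit(vin, vref)
```

In this model the output code is unchanged when the stage thresholds are
shifted by less than V_REF/4, because any wrong stage decision is absorbed by
the following stages; only the thresholds of the final flash need to be
accurate.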

7.3 MDAC 

The operation of the MDAC consists of two phases. During the first phase 
the input signal is sampled, while the second one is reserved for the subtraction 
and the amplification. The MDACs in successive stages operate in opposite 
clock phases. 

The MDAC is shown in Figure 12.60. In contrast to its SC counterpart,
the feedback capacitors are permanently connected around the amplifier and
the input capacitors to the output of the preceding stage. When the MDAC is
sampling, the inputs and the outputs of the opamp are connected to V_DD. In
the transition to the amplification phase they are released, and simultaneously
the preceding stage pulls the MDAC inputs to V_DD. Consequently, the charge in
the input capacitors is transferred to the feedback capacitors. The D/A operation
is realized by connecting the 2C-valued capacitors to the reference voltages
according to the bit code produced by the sub-A/D converter. The purpose of
the C-valued capacitors, also controlled by the bit code, is to keep the opamp
input common mode level at V_DD.

The Miller-type switchable opamp employed in the MDAC has already been
described in Chapter 10.

Figure 12.60. The multiplying digital-to-analog converter.

Figure 12.61. Differential comparator.

7.4 Comparator 

The comparator (Figure 12.61) consists of a preamplifier and a latch. The
preamplifier is realized with a differential nMOS pair driving resistor loads.
The latch, shown in Figure 12.62, consists of an n-type input pair and a
cross-coupled pMOS load with reset switches in parallel. The clock signal
controls a switch between the common source node of the input pair and the
ground. Although the ADC employs digital correction, careful balancing of the
load capacitance in the complementary outputs is needed to guarantee an offset
within the tolerable range.

Figure 12.62. Latch of the comparator.

The comparator inputs are connected to the outputs of the previous stage,
where the signal common mode level is V_DD/2. Since this voltage cannot
be applied directly to the gate of a transistor, the signal is level-shifted with
capacitors. This allows the common mode level at the preamplifier input to
be set to V_DD. The digital error correction permits a rather large error at the
comparator decision level, and thus the reference is built into the values of
the capacitors C1 and C2. The error correction also makes it possible to latch
the comparator before the output of the driving stage is fully settled. This is
utilized to have the comparison result ready at the time when the MDAC begins
the amplification.

7.5 Input Buffer 

The active input buffer described in Chapter 10 is utilized in this prototype. 

7.6 Experimental Results 

The prototype circuit is fabricated using a 0.5-µm CMOS process with three
metal and two polysilicon layers. Its die photograph is shown in Figure 12.63.
The total area of the chip is 3.8 mm².

Figure 12.64 shows the results of DNL and INL measurements. The DNL 
errors are within 0.6 LSB. The INL curve obtained is characteristic of all the 
measured samples. There, the large errors at the edges clearly indicate that 
the input buffer is not linear enough at the verges of the signal range (1.2 V 
differential). The INL error in the midrange, however, is very small, suggesting 
that the A/D itself works well. The INL figures reported in the conference paper 
[8] were calculated with a code density test program which contained an error. 
The figures obtained with a corrected version of the program are much better 
than the incorrect ones reported in the paper.

Figure 12.63. Die photograph of 1st switched opamp ADC.

Figure 12.64. Measured static linearity.

The SNDR measured as a function of input signal amplitude is shown in 
Figure 12.65. The measurement was performed using a 200-kHz sinusoidal 
input signal and an 8-MHz clock. The measured SNDR has a peak value of 
44.7 dB, which occurs when the input amplitude is 65% of the full scale. The 
deterioration of the SNDR with larger amplitudes is explained by the INL curve. 
The peak SNDR corresponds to 7.1 effective bits.

Figure 12.65. SNDR versus signal amplitude.



A spectrum with a 990-kHz 0.6-Vpp input signal is shown in Figure 12.66.
The largest spurious component is the third harmonic, which is at the -52-dBc 
level. The SFDR, which is dominated by the third harmonic, is presented as 
a function of input frequency in Figure 12.67. It is measured at 5-MHz and 
8-MHz clock rates using an input signal whose amplitude is 50% of the full 
scale. Although the 5-MHz clock gives a roughly 5 dB better SFDR than the 
8-MHz clock, the difference in SNDR is much smaller. 

The circuit operation was verified with supply voltages ranging from 1.0 to 
1.2 volts. All the presented results were measured with a 1.1-V supply. The
power consumption at the 8-MHz clock rate with a 200-kHz full scale input 
signal is 7.8 mW. 

8. 2nd Switched Opamp Pipelined ADC

8.1 Introduction

This section describes another switched-opamp implementation of a pipelined 
ADC, which is also published in [4] and [3]. The overall converter architecture 
is similar to that in the previous section, but most of the circuit blocks have been 
redesigned to achieve a more robust and less power-consuming realization.

Figure 12.66. Measured spectrum where the signal frequency is 990 kHz and amplitude 50
percent of the full scale.

Figure 12.67. Measured SFDR versus signal frequency.

8.2 MDAC

The developed fully differential SO MDAC is shown in Figure 12.68. In con- 
trast to its SC counterpart, the feedback capacitors are permanently connected 
around the amplifier and the input capacitors to the output of the preceding 
stage. Due to this the maximum achievable feedback factor for the MDAC in 
a 1.5 bits-per-stage architecture is 1/4, while being 1/2 in the SC realization, 
which makes the SO circuit inherently slower than the SC one. 
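
The two feedback factors quoted above can be checked with a short calculation.
The sketch assumes a 2C feedback capacitor (inferred from the gain of two with
the 4C input capacitor described below), two C-valued DAC capacitors, a
flip-around SC stage with equal sampling and feedback capacitors, and a
negligible opamp input capacitance. The symbols β_SO and β_SC are introduced
here for illustration only.

```latex
\[
\beta_{SO} = \frac{C_F}{C_F + C_{in} + C_{DAC}}
           = \frac{2C}{2C + 4C + 2C} = \frac{1}{4},
\qquad
\beta_{SC} = \frac{C_F}{C_F + C_S} = \frac{C}{C + C} = \frac{1}{2}.
\]
```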

To understand the circuit’s operation let us first look at the DC common
mode voltage levels in the circuit. The virtual ground at the opamp input is
set to V_DD, which is a suitable operating point for an opamp with nMOS input
transistors. The voltage level is kept unchanged when switching to the sampling
phase by shorting the opamp inputs to V_DD. To maximize the voltage swing the
signal at the opamp output is centered in the middle of the supply rails. During
the sampling phase the opamp output is in a high impedance state and pulled
to V_SS by the attached switches. As a result there is a V_DD/2 change in the
voltage level when switching from one phase to the other. Since the MDAC
input is connected to the output of the preceding stage, its voltage levels follow
a similar pattern, but in the opposite phase.

In the sampling phase, when the clock φ is high, the inputs are sampled
to the 4C-valued input capacitors. At the same time the signal voltages on the
reference and the feedback capacitors are reset. When entering the amplification
phase, the switches at the opamp inputs and outputs are opened. Slightly later,
the preceding stage pulls the MDAC inputs to V_SS. With the aid of the virtual
ground the sampled signal charge in the input capacitors is transferred into
the feedback capacitors, resulting in an output voltage that is the input voltage
multiplied by the capacitor ratio.

Figure 12.69. A half circuit of the differential input interface and a part of the first pipeline
stage.

The DAC function is realized with C-valued capacitors, which are connected
to either V_REF+ or V_REF- according to the two-bit binary code produced by
the stage’s sub-ADC. The DAC output, which is added to the sampled voltage,
has three possible output values: +V_REF, -V_REF, and 0. The first two are
achieved by connecting both the capacitors to the same reference voltage, while
to get the zero they are connected to the opposite voltages. The advantage of
this two-capacitor DAC compared to the DAC in the previous prototype, with
a single 2C-valued capacitor, is the fact that the common mode level for the
zero code is the same as for the other codes. This eliminates the extra switched
capacitor usually needed in SO circuits to compensate for the common mode
level change between the two operating modes.

The minimum supply voltage for the MDAC depends on the gate-source 
voltage needed to properly turn on the switch transistors and the values of 
the reference voltages. The reference voltages in turn determine the full-scale 
signal amplitude, which is equal to the signal swing at the opamp output. This 
requires the reference levels to be set at least a saturation voltage V_dsat apart
from the supply rails.

The MDAC uses the third switchable opamp proposed in Chapter 10. 

8.3 Input Stage 

The passive input interface developed for this SO circuit has already been 
explained in Chapter 10. For the sake of convenience it is shown again in 
Figure 12.69.

The values used for C, C_S, and R are 5 pF, 0.44 pF, and 1.5 kΩ respectively.
A total harmonic distortion of 70 dBc is predicted by a simulation made using
a 5-MHz clock rate and a 1.7-MHz sinusoidal input signal with a 1.2-Vpp
differential amplitude.

Figure 12.70. Fully differential comparator.

8.4 Comparator 

In the 1.5-bits-per-stage pipeline architecture there is one redundant
quantization level in each sub-ADC which, together with the digital correction,
permits ±V_REF/4 (±150 mV in this design) inaccuracy in the comparator
decisions. Consequently, the comparator can be a fairly simple dynamic circuit.
Since the analog signal path is fully differential, it is desirable that the com- 
parator has a fully differential input, which prevents the use of conventional 
single-ended comparator topologies where the signal is applied to one input 
pin and the reference voltage to the other. One possible way to realize a fully 
differential comparator is to employ a charge summation circuit. The imple- 
mentation used, e.g. in [102], consists of a latch stage preceded by coupling 
capacitors that are precharged to the reference voltages during the reset phase. 
In the acquisition phase the latch input experiences a voltage that is the sum of 
the input and the reference; as a result, the comparator decision level is equal to 
the reference voltage. The implementation used here, shown in Figure 12.70, 
relies on the same type of architecture. 

Due to the switched-opamp implementation of the MDAC, the input capac- 
itors cannot be disconnected from the output of the previous pipeline stage. 
Thus, another pair of capacitors is needed for adding the reference voltages.
During the reset phase the previous converter stage pulls the input capacitors
to V_SS and the inputs of the latch are reset to V_DD. At the same time the
reference capacitors are connected to V_DD. When the driving pipeline stage starts
its amplification phase, the reset switches of the comparator are released and
the resistor string, to which the reference capacitors are attached, is connected
between the global positive and negative reference voltages. The resistor string,
which is common for all the comparators in the same pipeline stage, provides
the voltages +V_REF/4 and -V_REF/4, setting the comparator decision level
accordingly.

There are two reasons why the quarter reference is realized with a resistor 
chain instead of simply scaling the reference capacitor values. First, when using 
equal size input and reference capacitors, the common mode voltage level in the 
latch input is automatically set to V_DD. Otherwise, an extra pair of capacitors
would have been needed to correct the voltage level. The second reason is the 
desire to keep the capacitors as small as possible. The capacitor value C/4 
would have been too small to implement without increasing the absolute value 
of the unit capacitor C. 

The dynamic latch is similar to the one used in the first version and shown 
in Figure 12.62. The digital error correction makes it possible to trigger the 
comparator before the output of the driving stage is fully settled. This is utilized 
to make sure that the A/D conversion result is ready at the time when the MDAC 
begins the amplification. The comparators are synchronized with the MDAC 
by adding a digital latch stage after them. 

8.5 Experimental Results 

The prototype chip is implemented in a 0.5-µm triple-metal double-poly
CMOS technology with 610-mV V_th for both nMOS and pMOS transistors.
A photograph of the die, which has an active area of 1.3 mm², is shown in
Figure 12.71. The prototype is packaged in a 44-pin CLCC package and the
measurements are performed using a test board made of 2-layer PCB and having
a socket for the chip.

The operation of the circuit is verified with a supply voltage range from 0.95 V
to 1.6 V. All the results given were obtained with a 1.0-V supply unless otherwise
specified. The reference voltages are set 600 mV apart and symmetrically
between the supply rails, resulting in a ±600-mV differential input signal range.

The measured DNL and INL curves are presented in Figure 12.72. For
nine measured samples the maximum DNL and INL are 0.6 LSB and 1.1 LSB
respectively. The SNDR versus the input signal amplitude measured at 1.0-V
and 1.5-V supply voltages and 5-MHz and 14-MHz respective clock rates is
shown in Figure 12.73. In both cases the peak SNDR is obtained with the
full-scale signal amplitude and has a value of 50.0 dB, which corresponds to
8.0 effective bits.
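
The effective-bit figures quoted for both prototypes are consistent with the
standard relation ENOB = (SNDR - 1.76 dB) / 6.02 dB, which assumes a
sinusoidal test signal. The helper below is only an illustrative sketch; the
function name is hypothetical.

```python
def enob(sndr_db):
    """Effective number of bits corresponding to an SNDR given in dB."""
    return (sndr_db - 1.76) / 6.02


# Peak SNDR values reported for the two prototypes
first_prototype = enob(44.7)
second_prototype = enob(50.0)
```

Rounded to one decimal, the two values are 7.1 and 8.0 effective bits, as
stated in the text.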



Figure 12.71. Photograph of the prototype chip.

Figure 12.72. Measured DNL and INL.



Figure 12.74 shows a spectrum where a 1.5-MHz full-scale sine wave is
sampled at 5.0 MS/s, the supply voltage being 1.0 V. A 14-MS/s spectrum
obtained with the 1.5-V supply is presented in Figure 12.75. With the low supply
voltage the spurious free dynamic range is limited by the second harmonic and
it is 59.5 dB. The level of the harmonic is raised as the bias current is increased
along with the supply voltage, reducing the SFDR to 55.3 dB at 1.5 V.

Figure 12.73. SNDR versus signal amplitude. At the 1.0-V supply voltage the clock rate is
5 MHz and the signal frequency 200 kHz. The 1.5-V curve is obtained with a 14-MHz clock
and a 1.5-MHz signal.

Figure 12.74. A spectrum measured at a 5-MHz clock rate using a 1.0-V supply voltage. The
signal is a 1.5-MHz full-scale sine wave.

Figure 12.75. A spectrum measured at a 14-MHz clock rate using a 1.5-V supply voltage. The
signal is a 5.1-MHz full-scale sine wave.

The ADC performance is limited by the static nonlinearities, since the SNDR 
is virtually constant up to a 6-MHz clock frequency, after which there is a sudden 
collapse in the signal quality, indicating timing problems in the digital circuitry. 
When the supply voltage is increased to 1.5 volts the same phenomenon is 
observed at the 14.5-MHz clock frequency. The power consumption from a 
1.0-V supply at the 5.0-MS/s sampling rate is 1.6 mW. In addition to increasing
the supply voltage to 1.5 V, the bias current has to be tripled to achieve the
14-MS/s sampling rate. This increases the power consumption to 8.2 mW, 
which is still very low compared to the 36 mW achieved in [102] with the same 
supply voltage and sampling rate. The measured performance is summarized 
in Table 12.4. 
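
The power figures above can be put on a common footing by dividing each by
its sampling rate, which gives the energy spent per output sample. This is a
simple illustrative calculation using only the numbers quoted in the text;
the function name is hypothetical and the metric is not one used in this work.

```python
def energy_per_conversion_pj(power_mw, rate_msps):
    """Energy per output sample in picojoules: power divided by sample rate."""
    return power_mw * 1e-3 / (rate_msps * 1e6) * 1e12


low_supply = energy_per_conversion_pj(1.6, 5.0)    # 1.0-V supply, 5 MS/s
high_supply = energy_per_conversion_pj(8.2, 14.0)  # 1.5-V supply, 14 MS/s
reference = energy_per_conversion_pj(36.0, 14.0)   # figure quoted for [102]
```

The three values are about 320 pJ, 590 pJ, and 2570 pJ per sample
respectively, so at the same supply voltage and sampling rate the prototype
uses a little less than a quarter of the energy quoted for [102].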

The measurements demonstrate that the SO technique, previously considered 
an immature and performance-limited technique, can be used to realize low- 
voltage ADCs and provides a performance well-matched to that found with 
other low-voltage realizations.

Supply voltage        1.0 V        1.5 V
Resolution            9 bits
Conversion rate       5 MS/s       14 MS/s
DNL                   0.6 LSB      0.6 LSB
INL                   1.1 LSB      1.1 LSB
SNDR                  50.0 dB      50.0 dB
SFDR                  59.5 dB      55.3 dB
Input range           ±600 mV      ±600 mV
Power consumption     1.6 mW       8.2 mW
Technology            0.5-µm CMOS
Active area           1.2 mm²

Table 12.4. Summarized performance of the ADC.



Chapter 13 



CONCLUSIONS 



Technology scaling will bring advantages to SC circuits for the next few 
technology generations, but after that the effect of the decreasing supply voltage 
on the signal-to-noise ratio will start to dominate over the positive effects of 
shrinking transistor dimensions. Even before that, maximizing the signal range 
is essential for exploiting the benefits of technology scaling. The signal range 
has the largest effect on the opamps, making the use of a rail-to-rail output 
stage mandatory. The switches do not have major difficulties in adapting to the 
nominal supply voltage of the technology. On the contrary, when a smaller-than- 
nominal supply voltage is used or a performance increase is pursued, techniques 
such as gate voltage bootstrapping or the switched-opamp technique are needed 
to guarantee a small enough switch on-resistance. 

In this work the utilization of the switched-opamp technique in pipelined 
ADCs has been demonstrated. Connecting the SO circuit to the outside world 
has been solved with a passive input interface circuit. The main limitation 
of the SO technique is the low speed caused by opamp switching and the de- 
creased feedback factor. Thus, probably a better approach for low-voltage 
circuits requiring high speed is the selective use of bootstrapped switches and 
the utilization of different common mode signal levels at the opamp input and 
the output. The applications of the SO technique are in the areas where speed 
is not the most critical parameter. 

In wide-band radio receivers, moving the signal digitization into a high
intermediate frequency is attractive. It has been demonstrated that the sampling
can be done with relatively high linearity by using bootstrapped switches. A 
more fundamental problem is jitter, which can be alleviated only by extensive 
oversampling. 
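
The severity of the jitter problem can be illustrated with the standard
expression for the jitter-limited SNR of a sampled sine wave,
SNR = -20 log10(2π f_in σ_t), where σ_t is the rms sampling jitter. The sketch
below evaluates it. The function name, the example numbers, and the assumption
that the jitter-induced noise is white (so that oversampling followed by
filtering improves the SNR by 10 log10 of the oversampling ratio) are
illustrative choices, not results from this work.

```python
import math


def jitter_limited_snr_db(f_in_hz, sigma_t_s, osr=1.0):
    """SNR bound in dB set by rms sampling jitter for a sinusoidal input.

    osr > 1 adds 10*log10(osr), assuming white jitter noise and ideal
    filtering down to the signal band (an assumption of this sketch).
    """
    snr_nyquist = -20.0 * math.log10(2.0 * math.pi * f_in_hz * sigma_t_s)
    return snr_nyquist + 10.0 * math.log10(osr)


# Hypothetical example: 100-MHz input, 1-ps rms jitter
snr_nyquist_rate = jitter_limited_snr_db(100e6, 1e-12)
snr_oversampled = jitter_limited_snr_db(100e6, 1e-12, osr=4.0)
```

For these example numbers the bound is about 64 dB, and fourfold oversampling
raises it by only about 6 dB, which is why extensive oversampling is needed to
gain a meaningful improvement.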

The elimination of the bulk effect and signal-dependent drain and source 
junction capacitances is essential in a highly linear bootstrapped switch. This
can most easily be accomplished with a triple-well process. The same, although
with a smaller linearity boost, has here been demonstrated with a standard 
CMOS technology. 

Time interleaving is a way to extend the conversion rates of ADCs beyond 
the technology-determined limits of one stand-alone A/D channel. For elim- 
inating the non-uniform sampling effects caused by timing skew between the 
parallel channels, a front-end S/H circuit is practically mandatory. As a result, it 
becomes a speed bottleneck, typically limiting the number of parallel channels 
to a maximum of four. 

Moore’s law can be most effectively exploited by doing things digitally. 
This can be realized on the system level by moving analog functions into the 
digital domain, but also inside the analog system blocks by using even extensive 
amounts of digital circuitry to correct and compensate for the imperfections of 
the analog circuitry. In ADCs this means incorporating logic and memory for 
calibrating and correcting the offsets and the component mismatch. 



Appendix A 

Derivation of OTA GBW Requirement 



The small signal model for an SC amplifier in hold mode is shown in Figure A.1, where the
OTA is represented with a single pole model consisting of $g_m$, $g_o$, and $C_L$. The capacitance $C_L$
is the sum of the OTA output capacitance and the external load capacitance. The capacitance at
the OTA input is represented with $C_{in}$ and the output conductance with $g_o$.

This circuit is used to find out the settling time constant using a pulsed current source $i_i$ as
the excitation signal. The small signal analysis yields the following transfer function:

\[
\frac{v_o}{i_i} = -\frac{g_m - sC_F}{s\left[(g_m C_F + g_o C_{i,tot} + g_o C_F) + s\,(C_{i,tot} C_L + C_L C_F + C_F C_{i,tot})\right]}, \tag{A.1}
\]

where $C_{i,tot} = C_S + C_{in}$. The output voltage for a current impulse with total integrated charge
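$Q$ can be written (in partial fraction form) as

\[
V_o(s) = -\frac{Q\,g_m}{g_m C_F + g_o C_{i,tot} + g_o C_F}\cdot\frac{1}{s}
+ \frac{Q\left[C_F + \dfrac{g_m\,(C_{i,tot} C_L + C_L C_F + C_F C_{i,tot})}{g_m C_F + g_o C_{i,tot} + g_o C_F}\right]}{(g_m C_F + g_o C_{i,tot} + g_o C_F) + s\,(C_{i,tot} C_L + C_L C_F + C_F C_{i,tot})}. \tag{A.2}
\]

Figure A.1. Small signal model for SC amplifier in hold mode.

The corresponding time domain expression is obtained with the inverse Laplace transform, giving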



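\[
v_o(t) = -\frac{Q\,g_m}{g_m C_F + g_o C_{i,tot} + g_o C_F}
\left(1 - e^{-\frac{g_m C_F + g_o C_{i,tot} + g_o C_F}{C_{i,tot} C_L + C_L C_F + C_F C_{i,tot}}\,t}\right)
+ \frac{Q\,C_F}{C_{i,tot} C_L + C_L C_F + C_F C_{i,tot}}\,
e^{-\frac{g_m C_F + g_o C_{i,tot} + g_o C_F}{C_{i,tot} C_L + C_L C_F + C_F C_{i,tot}}\,t}, \tag{A.3}
\]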



from where the settling time constant can be identified as

\[
\tau = \frac{C_{i,tot} C_L + C_L C_F + C_F C_{i,tot}}{g_m C_F + g_o C_{i,tot} + g_o C_F}. \tag{A.4}
\]

Since $g_m$ is large compared to $g_o$ the time constant can be approximated with

\[
\tau \approx \frac{C_{i,tot} C_L + C_L C_F + C_F C_{i,tot}}{g_m C_F}. \tag{A.5}
\]

A well-known relation states that the gain-bandwidth product of a single-pole OTA is

\[
GBW = \frac{g_m}{2\pi C_L}. \tag{A.6}
\]

Substituting this to (A.5) yields

\[
\tau = \frac{1}{2\pi\,GBW}\left(1 + \frac{C_{i,tot}\,(C_L + C_F)}{C_L C_F}\right). \tag{A.7}
\]

When the last term resulting from the feed-forward path is ignored in (A.3), the output settles to
$N$-bit accuracy in the time period $T$ if

\[
e^{-T/\tau} < 2^{-N}, \tag{A.8}
\]

which leads to the following requirement for the GBW:

\[
GBW > \frac{N \ln 2}{2\pi T}\left(1 + \frac{C_{i,tot}\,(C_L + C_F)}{C_L C_F}\right). \tag{A.9}
\]

The GBW can also be expressed in terms of the feedback factor $f = C_F/(C_F + C_{i,tot})$ and
the effective load capacitance $C_{L,tot} = C_L + C_F C_{i,tot}/(C_F + C_{i,tot})$ [239], resulting in

\[
GBW > \frac{N \ln 2\; C_{L,tot}}{2\pi f\,T\,C_L}. \tag{A.10}
\]
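
The two forms of the GBW requirement can be compared numerically. The sketch
below reads (A.9) as GBW > N ln2 / (2πT) · [1 + C_i,tot (C_L + C_F) / (C_L C_F)]
and (A.10) as GBW > N ln2 · C_L,tot / (2π f T C_L), with f and C_L,tot as
defined in the text. The function names and the component values in the
example are hypothetical and chosen only for illustration.

```python
import math


def gbw_min_direct(n_bits, t_settle, c_itot, c_f, c_l):
    """Minimum GBW in Hz, written directly in terms of the capacitances."""
    cap_term = 1.0 + c_itot * (c_l + c_f) / (c_l * c_f)
    return n_bits * math.log(2) / (2.0 * math.pi * t_settle) * cap_term


def gbw_min_feedback(n_bits, t_settle, c_itot, c_f, c_l):
    """Same bound via the feedback factor and the effective load capacitance."""
    f = c_f / (c_f + c_itot)                      # feedback factor
    c_ltot = c_l + c_f * c_itot / (c_f + c_itot)  # effective load capacitance
    return n_bits * math.log(2) * c_ltot / (2.0 * math.pi * f * t_settle * c_l)


# Hypothetical example: 9-bit settling in 50 ns
example = gbw_min_direct(9, 50e-9, c_itot=1.2e-12, c_f=0.5e-12, c_l=1.0e-12)
```

For these example values both functions give roughly 91 MHz, which confirms
that the two expressions are algebraically the same bound.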



Appendix B 

Optimum Input Capacitance 



In this appendix an optimum value, which minimizes the settling time of an SC amplifier, will
be derived for OTA input transistor gate capacitance. In strong inversion the transconductance
of a MOS transistor is given by

\[
g_m = \sqrt{\frac{2 I_D \mu C_{ox} W}{L}}. \tag{B.1}
\]

Using the gate capacitance $C_G = C_{ox} W L$, it can be rewritten as

\[
g_m = \frac{\sqrt{2 I_D \mu C_G}}{L}. \tag{B.2}
\]

The capacitance $C_{in}$ in Appendix A is now $C_G$. Again, from Appendix A the settling time
constant (A.7) becomes

\[
\tau = \frac{L\,C_L}{\sqrt{2 I_D \mu C_G}}\left(1 + \frac{(C_S + C_G)(C_L + C_F)}{C_F C_L}\right). \tag{B.3}
\]

Finding the minimum time constant yields the optimum gate capacitance, which is given by

\[
C_{G,opt} = \frac{C_F C_L}{C_F + C_L} + C_S. \tag{B.4}
\]
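
The optimum can be checked numerically. The sketch below takes the
strong-inversion settling time constant as proportional to
C_L / sqrt(C_G) · [1 + (C_S + C_G)(C_L + C_F) / (C_F C_L)] and compares a
brute-force search against the closed form C_F C_L / (C_F + C_L) + C_S. The
function names, the lumped constant k, and the capacitor values are
hypothetical and serve only as an illustration.

```python
import math


def tau_strong_inversion(c_g, c_s, c_f, c_l, k=1.0):
    """Settling time constant up to a constant factor k = L / sqrt(2 mu I_D)."""
    cap_term = 1.0 + (c_s + c_g) * (c_l + c_f) / (c_f * c_l)
    return k * c_l / math.sqrt(c_g) * cap_term


def c_g_opt(c_s, c_f, c_l):
    """Closed-form optimum: series combination of C_F and C_L plus C_S."""
    return c_f * c_l / (c_f + c_l) + c_s


# Hypothetical capacitor values in farads
c_s, c_f, c_l = 1.0e-12, 0.5e-12, 1.0e-12
grid = [i * 1e-15 for i in range(100, 5001)]  # 0.1 pF to 5 pF in 1-fF steps
best = min(grid, key=lambda c: tau_strong_inversion(c, c_s, c_f, c_l))
```

With these values the search lands on the grid point nearest to the
closed-form optimum of about 1.33 pF. The minimum is shallow, so moderate
deviations from the optimum cost little settling time.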

When the same analysis is repeated, assuming that the transistor is biased in weak
inversion, where the transconductance is given by

\[
g_m = \frac{q}{n k T}\,I_D, \tag{B.5}
\]

the settling time constant becomes

\[
\tau = \frac{n k T\,C_L}{q\,I_D}\left(1 + \frac{(C_S + C_G)(C_L + C_F)}{C_F C_L}\right). \tag{B.6}
\]



Now there is no optimum gate capacitance, but the settling time increases with $C_G$ (i.e. with $W$
when $L$ is fixed).

In velocity saturation, where $g_m = v_{sat} C_G / L$, the settling time constant is

\[
\tau = \frac{L\,C_L}{v_{sat}\,C_G}\left(1 + \frac{(C_S + C_G)(C_L + C_F)}{C_F C_L}\right). \tag{B.7}
\]

Again, there is no optimum gate capacitance. As opposed to the weak inversion case, the settling
time now decreases with increasing $C_G$ (i.e. with increasing $W$ when $L$ is fixed).





Appendix C 
Saturation Voltage 



This appendix will study how the opamp output transistor saturation voltage $V_{dsat}$ can be
scaled with respect to $V_{DD}$. It is assumed that $V_{dsat} = k_d V_{DD}^m$ and thus the task is to find
expressions for the constants $m$ and $k_d$.

Another assumption is that the transistor current is determined by the slew-rate requirement,
and thus $I_D = I_{SR}$. The opamp employs the Miller topology, where the second pole is
determined by the output transistor and is given by

\[
p_2 = -\frac{g_m C_C}{C_L\,(C_C + C_1)}, \tag{C.1}
\]

where $C_C$ is the compensation capacitance, $C_L$ the load capacitance, and $C_1$ the first stage output
capacitance (including the second stage input capacitance). All the capacitances are assumed
to be linearly proportional to the sampling capacitor $C$ and thus $p_2 \propto -g_m/C$. To prevent the
opamp from losing speed, the magnitude of $p_2$ has to be constant or increasing as $V_{DD}$ is scaled
down. The pole is given by

\[
\frac{g_m}{C} = \frac{2 I_D}{(V_{GS} - V_T)\,C} = \frac{2 I_{SR}}{V_{dsat}\,C}. \tag{C.2}
\]



Substituting Isr from (2.8) yields 



( Vd D V margin ) 



Vdd 

Vdd 



Since Vmargin is proportional to Vdsat , which makes the last term constant, the pole frequency 
will not decrease if m > 1. 

Another important thing is the second stage input capacitance $C_1$. It is not allowed to grow
faster than the compensation capacitance, i.e. not faster than $C$. It is assumed that $C_1$ is
dominated by the transistor gate capacitance. From the transistor current equation we can get



C G = C ox W L cx 



{Vdd — Vm 



h,2\/2m 
^6 v DD 
For a given technology (fixed $L$) $m$ must be 0.5 or less, which conflicts with the earlier
requirement. If $L$ scales linearly with the supply voltage, then $m < 1.5$, which, together with the earlier
requirement, results in $1 < m < 1.5$.

So, it was wrong to assume that in the case of a fixed technology the output transistor size
and current are always constrained by the slew rate. Writing the expression for the second pole
again yields

\[
\frac{g_m}{C} = \frac{2\mu C_{ox} W\,(V_{GS} - V_T)}{L\,C} = \frac{2\mu\,C_G V_{dsat}}{L^2 C}. \tag{C.5}
\]

If $C_G$ is now allowed to grow as fast as possible by making it proportional to $C$, then $V_{dsat}$ will
be constant and the current will need to be increased faster than required by the slew rate.

The above analysis is made for a Miller opamp, but the results are applicable to other opamp
topologies as well. For example, in the folded cascode opamp the non-dominant pole is
determined by the $g_m$ of the cascode transistor and parasitic capacitances that are proportional to the
gate capacitance. Thus, an analysis of it would yield similar results.